Preface



I would like to welcome all the participants to the 3rd International Conference 
on Information Security and Cryptology (ICISC 2000). It is sponsored by the 
Korea Institute of Information Security and Cryptology (KIISC) and is being 
held at Dongguk University in Seoul, Korea from December 8 to 9, 2000. This 
conference aims at providing a forum for the presentation of new results in 
research, development, and application in information security and cryptology. 
This is also intended to be a place where research information can be exchanged. 

The Call for Papers brought 56 papers from 15 countries and 20 papers will 
be presented in five sessions. As was the case last year the review process was 
totally blind and the anonymity of each submission was maintained. The 22 TPC 
members finally selected 20 top-quality papers for presentation at ICISC 2000. 

I am very grateful to the TPC members who devoted much effort and time 
to reading and selecting the papers. We also thank the experts who assisted the 
TPC in evaluating various papers and apologize for not including their names 
here. Moreover, I would like to thank all the authors who submitted papers to 
ICISC 2000 and the authors of accepted papers for their preparation of camera- 
ready manuscripts. Last but not least, I thank my student, Joonsuk Yu, who 
helped me during the whole process of preparation for the conference. 

I look forward to your participation and hope you will find ICISC 2000 a truly 
rewarding experience. 



December 2000 



Dongho Won 




Organization 



The 3rd International Conference on Information Security and Cryptology 
(ICISC 2000) is organized and sponsored by the Korea Institute of Information 
Security and Cryptology (KIISC). 



Executive Committee 



Kil Hyun Nam    General Chair (President of KIISC, Korea)
Dong Ho Won     TPC Chair (Sungkyunkwan University, Korea)
Jae Ho Shin     Organizing Chair (Dongguk University, Korea)



Technical Program Committee 



Dong Ho Won         Sungkyunkwan University, Korea
Zongduo Dai         Academia Sinica, P.R.C.
Ed Dawson           Queensland University of Technology, Australia
Chul Kim            Kwangwoon University, Korea
Kwang Jo Kim        Information and Communications University, Korea
Kaoru Kurosawa      Tokyo Inst. of Tech., Japan
Kwok-Yan Lam        National University of Singapore, Singapore
Kyoung Goo Lee      KISA, Korea
Pil Joong Lee       Pohang University of Sci. and Tech., Korea
Chae Hoon Lim       Future System Inc., Korea
Masahiro Mambo      Tohoku University, Japan
Jong In Lim         Korea University, Korea
Chris Mitchell      University of London, U.K.
Sang Jae Moon       Kyungpook National University, Korea
Kaisa Nyberg        Nokia Research Center, Finland
Eiji Okamoto        Toho University, Japan
Tatsuaki Okamoto    NTT, Japan
Choon Sik Park      ETRI, Korea
Sung Jun Park       BCQRE CO., LTD, Korea
Bart Preneel        Katholieke Universiteit Leuven, Belgium
Heung Youl Youm     Soonchunhyang University, Korea
Moti Yung           CertCo, U.S.A.
Yuliang Zheng       Monash University, Australia




Organizing Committee

Jae Ho Shin         Dongguk University, Korea
Ghee Sun Won        Dongguk University, Korea
Sang Kyu Park       Hanyang University, Korea
Ha Bong Chung       Hongik University, Korea
Dong Hoon Lee       Korea University, Korea
Jae Jin Lee         Dongguk University, Korea
Howang Bin Ryou     Kwangwoon University, Korea
Seok Woo Kim        Hansei University, Korea
Yong Rak Choi       Taejon University, Korea
Hyun Sook Cho       ETRI, Korea
Hong Sub Lee        KISA, Korea
Seung Joo Han       Chosun University, Korea
Min Surp Rhee       Dankook University, Korea
Seog Pal Cho        Seonggul University, Korea
Kyung Seok Lee      KIET, Korea
Joo Seok Song       Yonsei University, Korea
Jong Seon No        Seoul National University, Korea
Tai Myoung Chung    Sungkyunkwan University, Korea

Sponsoring Institutions

MIC (Ministry of Information and Communication), Korea
IITA (Institute of Information Technology Assessment), Korea
KISA (Korea Information Security Agency), Korea
Dongguk University, Korea
The Electronic Times, Korea




Table of Contents 



A Note on the Higher Order Differential Attack of Block Ciphers with 

Two-Block Structures 1 

Ju-Sung Kang, Seongtaek Chee, Choonsik Park (Department of Basic 
Technology, NSRI, Korea) 

On the Strength of KASUMI without FL Functions against Higher Order 

Differential Attack 14 

Hidema Tanaka, Chikashi Ishii, Toshinobu Kaneko (Science University 
of Tokyo, Japan) 

On MISTY1 Higher Order Differential Cryptanalysis 22

Steve Babbage (Vodafone Ltd, England), Laurent Frisch (France Telecom 
Recherche et Developpement, France) 

Difference Distribution Attack on DONUT and Improved DONUT 37 

Dong Hyeon Cheon, Seok Hie Hong, Sang Jin Lee ( Center for 
Information and Security Technologies, Korea University, Korea), 

Sung Jae Lee, Kyung Hwan Park, Seon Hee Yoon (Korea Information 
Security Agency, Korea) 

New Results on Correlation Immunity 49 

Yuliang Zheng (Monash University, Australia), Xian- Mo Zhang 
(The University of Wollongong, Australia) 

Elliptic Curves and Resilient Functions 64 

Jung Hee Cheon (Mathematics Department, Brown University, USA, 
and Securepia, Korea), Seongtaek Chee (NSRI, Korea) 

Fast Universal Hashing with Small Keys and No Preprocessing: The PolyR 

Construction 73 

Ted Krovetz (Department of Computer Science, University of 
California, USA), Phillip Rogaway (Department of Computer Science, 
Chiang Mai University, Thailand) 

Characterization of Elliptic Curve Traces under FR-Reduction 90 

Atsuko Miyaji, Masaki Nakabayashi, Shunzo Takano (Japan Advanced 
Institute of Science and Technology, Japan) 

A Multi-party Optimistic Non-repudiation Protocol 109 

Olivier Markowitch, Steve Kremer ( Computer Science Department, 
Universite Libre de Bruxelles, Belgium) 

Secure Matchmaking Protocol 123 

Byoungcheon Lee, Kwangjo Kim (Information and Communications 
University, Korea) 







An Improved Scheme of the Gennaro-Krawczyk-Rabin Undeniable 

Signature System Based on RSA 135 

Takeru Miyazaki (Kyushu Institute of Technology, Japan) 

Efficient and Secure Member Deletion in Group Signature Schemes 150 

Hyun-Jeong Kim, Jong In Lim, Dong Hoon Lee (Center for
Information Security Technologies, Korea University, Korea) 



An Efficient and Practical Scheme for Privacy Protection in the
E-Commerce of Digital Goods 162

Feng Bao, Robert H. Deng, Peirong Feng (Kent Ridge Digital Labs,
Singapore)

An Internet Anonymous Auction Scheme 171 

Yi Mu, Vijay Varadharajan (School of Computing and IT, University 
of Western Sydney, Australia) 

Efficient Sealed-Bid Auction Using Hash Chain 183

Koutarou Suzuki, Kunio Kobayashi, Hikaru Morita (NTT Laboratories,
Japan)

Micropayments for Wireless Communications 192

Dong Gook Park (Access Network Laboratory, Korea Telecom, Korea
and Queensland University of Technology , Australia), Colin Boyd, 

Ed Dawson (Information Security Research Centre, Queensland 
University of Technology, Australia) 



Cryptographic Applications of Sparse Polynomials over Finite Rings 206

William D. Banks (Department of Mathematics, University of
Missouri, USA), Daniel Lieman (Department of Mathematics,
University of Georgia, USA), Igor E. Shparlinski, Van Thuong To
(Department of Computing, Macquarie University, Australia)

Efficient Anonymous Fingerprinting of Electronic Information with 

Improved Automatic Identification of Redistributors 221 

Chanjoo Chung, Seungbok Choi, Dongho Won (School of Electrical and 
Computer Engineering, Sungkyunkwan University, Korea), 

Youngchul Choi (BCQRE Co., Korea) 



Hash to the Rescue: Space Minimization for PKI Directories 235 

Adam Young ( Columbia University, USA ), Moti Yung ( CertCo, USA ) 

A Design of the Security Evaluation System for Decision Support in the 

Enterprise Network Security Management 246 

Joe Seung Lee, Sang Choon Kim, Seung Won Sohn (Information 
Security Technology Division, ETRI, Korea) 



Author Index 



261 




A Note on the Higher Order Differential Attack 
of Block Ciphers with Two-Block Structures 



Ju-Sung Kang, Seongtaek Chee, and Choonsik Park 



Section 8100, Department of Basic Technology, NSRI 
161 Kajong-Dong, Yusong-Gu, Taejon, 305-350, KOREA
{jskang, chee, csp}@etri.re.kr



Abstract. We study the security against higher order differential
attack of block ciphers with two-block structures that are provably
secure against differential and linear cryptanalysis. The two-block
structures are classified into three types according to the location of
the round function: C(Center)-type, R(Right)-type, and L(Left)-type. We
prove that for 4-round encryption functions these three types
provide equal strength against higher order differential attack, and
that for 5 or more rounds the R-type is weaker than the C-type
and L-type. Moreover, we show that these facts also hold for
probabilistic higher order differential attack.

Keywords: DC, LC, provable security, (probabilistic) higher order
differential attack, two-block structure



1 Introduction 

The best-known methods of analyzing block ciphers today are differential
cryptanalysis (DC) and linear cryptanalysis (LC). DC, proposed by Biham and
Shamir[2,3] in 1990, is a chosen plaintext attack in which the attacker chooses
plaintexts with certain well-considered differences. LC, published by Matsui[11] in
1993, is a known plaintext attack in which the attacker finds
effective linear expressions, called linear approximations.

Several measures exist for evaluating the security of block ciphers
against DC and LC. When DC was first proposed, the security
of a given block cipher was evaluated by its maximum differential characteristic
probability, which was found heuristically. However, Lai et al.[10] pointed
out that block cipher designers should use the maximum average of differential
probability instead of the maximum differential characteristic probability
for evaluating security against DC. The situation for LC is similar.
When LC was first presented, the security against LC was evaluated by the
maximum linear approximation probability. Nyberg[15] claimed that designers
should use the maximum average of linear approximation probabilities, which is
called the linear hull probability, instead of the maximum linear approximation
probability.

D. Won (Ed.): ICISC 2000, LNCS 2015, pp. 1-13, 2001. 
Nyberg and Knudsen[16] showed that Feistel ciphers are provably secure
against DC and LC, which means that the upper bounds of their maximum average
differential and linear probabilities are sufficiently small. Moreover,
they proposed a block cipher, called the $\mathcal{KN}$ cipher, that is provably secure
against DC and LC. The notion of provable security against DC and LC was
also studied by Matsui[12], Aoki and Ohta[1], and Kaneko et al.[7].

However, provable security against DC and LC does
not guarantee security against other attacks. For example, Jakobsen and
Knudsen[6] showed that the $\mathcal{KN}$ cipher can be broken by the higher order
differential attack with much less complexity than DC and LC. The complexity of
DC (or LC) for the $\mathcal{KN}$ cipher with 6 rounds is about 2®°, but that of higher
order differential attack is only about 2®. Therefore, it is clear that higher order
differential attack provides a security evaluation aspect different from that of
provable security against DC and LC.

In this work we investigate the relationship between the security of block 
ciphers against higher order differential attack and the overall structures of block 
ciphers. The overall structures of block ciphers with provable security against DC 
and LC are mainly two-block structures in which the plaintext is divided into two 
equal sub-blocks and one sub-block is transformed by a round function and then 
exclusive-ored with the other sub-block in each round. The two block structures 
are classified as three types according to the location of round function such as 
C(Center)-type, R(Right)-type, and L(Left)-type. We prove that in the case of 4 
rounds encryption function, these three types provide the equal strength against 
higher order differential attack and that in the case of 5 or more rounds, R-type 
is weaker than C-type and L-type. Further, we also discuss the security of block 
ciphers with two-block structures against probabilistic higher order differential 
attack [4]. 

2 Two-Block Structures with Provable Security against 
DC and LC 

Let F be a round function of a block cipher with n input and output bits, that
is, $F : Z_2^n \to Z_2^n$. For any given $\Delta x, \Delta y, a, b \in Z_2^n$, define the differential and
linear probability of F by

$$DP^F(\Delta x \to \Delta y) = \frac{\#\{x \in Z_2^n : F(x) \oplus F(x \oplus \Delta x) = \Delta y\}}{2^n}$$

and

$$LP^F(a \to b) = \left( \frac{\#\{x \in Z_2^n : \langle a, x \rangle = \langle b, F(x) \rangle\}}{2^{n-1}} - 1 \right)^2,$$

respectively, where $\langle \alpha, \beta \rangle$ denotes the parity (0 or 1) of the bitwise product of $\alpha$
and $\beta$. $DP^F$ and $LP^F$ for a strong cryptographic function F should be small
for any $\Delta x \neq 0$ and $b \neq 0$. So we define parameters representing the immunity of F
against DC and LC as follows:

$$DP^F_{\max} = \max_{\Delta x \neq 0,\ \Delta y} DP^F(\Delta x \to \Delta y)$$

and

$$LP^F_{\max} = \max_{a,\ b \neq 0} LP^F(a \to b).$$

Assume that E is a key-dependent encryption function with 2n input and
output bits and let K be the set of all possible key values. For any $k \in K$, $E^{(k)}$
denotes the one-variable function from $Z_2^{2n}$ to $Z_2^{2n}$ with the fixed key k. Then we
can define $DP^{E^{(k)}}(\Delta x \to \Delta y)$ and $LP^{E^{(k)}}(a \to b)$ in the same way as for F,
where $\Delta x, \Delta y, a, b \in Z_2^{2n}$. The averages of differential and linear probability
of E are defined by

$$DP^E(\Delta x \to \Delta y) = \frac{1}{\#K} \sum_{k \in K} DP^{E^{(k)}}(\Delta x \to \Delta y)$$

and

$$LP^E(a \to b) = \frac{1}{\#K} \sum_{k \in K} LP^{E^{(k)}}(a \to b),$$

respectively. If $DP^E$ and $LP^E$ are sufficiently small for any $\Delta x, \Delta y, a, b \in Z_2^{2n}$
such that $\Delta x \neq 0$ and $b \neq 0$, we say that E is provably secure against DC and
LC. Equivalently, E has provable security against DC and LC if

$$DP^E_{\max} = \max_{\Delta x \neq 0,\ \Delta y} DP^E(\Delta x \to \Delta y)$$

and

$$LP^E_{\max} = \max_{a,\ b \neq 0} LP^E(a \to b)$$

are low enough.
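As a concrete illustration of the two parameters just defined, the following sketch computes $DP^F_{\max}$ and $LP^F_{\max}$ by exhaustive search for a small table-based function. It is only an example under stated assumptions: the helper names (`parity`, `dp_max`, `lp_max`) are hypothetical, and the 4-bit S-box of the PRESENT cipher is used purely as a convenient test input; it does not appear in the paper.

```python
def parity(v: int) -> int:
    """Parity (0 or 1) of the bits of v; parity(a & x) is the inner product <a, x>."""
    return bin(v).count("1") & 1


def dp_max(sbox, n):
    """Maximum of DP^F(dx -> dy) over all dx != 0 and all dy."""
    size = 1 << n
    best = 0
    for dx in range(1, size):
        counts = [0] * size
        for x in range(size):
            counts[sbox[x] ^ sbox[x ^ dx]] += 1
        best = max(best, max(counts))
    return best / size


def lp_max(sbox, n):
    """Maximum of LP^F(a -> b) over all a and all b != 0."""
    size = 1 << n
    best = 0.0
    for b in range(1, size):
        for a in range(size):
            agree = sum(parity(a & x) == parity(b & sbox[x]) for x in range(size))
            # (#agreements / 2^(n-1) - 1)^2, as in the definition of LP^F
            best = max(best, (2 * agree / size - 1) ** 2)
    return best


# Example inputs (assumptions, not taken from the paper).
PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
IDENTITY = list(range(16))

if __name__ == "__main__":
    print("S-box   :", dp_max(PRESENT_SBOX, 4), lp_max(PRESENT_SBOX, 4))
    print("identity:", dp_max(IDENTITY, 4), lp_max(IDENTITY, 4))
```

A linear map such as the identity attains the worst possible value 1 for both parameters, which is why a round function needs small $DP^F_{\max}$ and $LP^F_{\max}$ before the provable security bounds below become meaningful.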

The two-block structures with provable security against DC and LC are clas- 
sified as three types according to the location of round function F. Three possible 
positions of the round function F are expressed in Fig. 1. 




Fig. 1. Possible positions of the round function 
We say that a two-block structure is C-type if the round function F is situated
in the center of the two blocks. Similarly, we say that a two-block structure
is R-type (or L-type) if the round function F is located on the right (or left) hand
side of the two blocks. See Fig. 2.

Fig. 2. Three types of two-block structures (panels: C-type, R-type, L-type)



The two-block structure of C-type is well known as the Feistel structure and
is applied to the overall structure of DES. Nyberg and Knudsen[16] proved that
if $DP^F_{\max} \le p$ and $LP^F_{\max} \le q$, then $DP^E_{\max}$ (or $LP^E_{\max}$) of the block cipher with
C-type structure can be evaluated as $DP^E_{\max} \le 2p^2$ (or $LP^E_{\max} \le 2q^2$), where the
number of rounds is 4 (or 3 if F is bijective) or more. Aoki and Ohta[1] improved
the results of Nyberg and Knudsen as follows: $DP^E_{\max} \le p^2$ and $LP^E_{\max} \le q^2$ if
F is bijective and the block cipher of C-type has more than 3 rounds.

Matsui[12] introduced another structure of block ciphers with provable
security against DC and LC which is different from the Feistel structure. It is the
two-block structure of R-type and is applied to the overall structure of the block cipher
MISTY[13]. The R-type structure has the merit that it allows parallel
computation of the round functions, but its round functions must be bijections for
the decryption process. Matsui[12] proved that $DP^E_{\max} \le p^2$ and $LP^E_{\max} \le q^2$ if F
is a bijection and the block cipher of R-type has more than 3 rounds. That is,
block ciphers of C-type and R-type have the same provable security against DC
and LC.

On the other hand, Kaneko et al.[7] showed that the block cipher of L-type 
and R-type have the same provable security against DC and LC by using the 
duality between probabilities of differential and linear hull. Skipjack [14] is a 
block cipher of four-block structure, but it seems that the overall structure of 
Skipjack is an extension of the L-type two-block structure. 

Three different types, C-type, R-type, and L-type, have the same provable 
security against DC and LC, but this does not guarantee that their security 
against other attacks is also the same. Especially, it seems that their immunities 
against higher order differential attack are not in accord with each other. So we 
investigate the security against higher order differential attack of three different 
types. Throughout this paper we assume that the round functions in the R-type 
and L-type are bijective. 
3 Algebraic Degree and Higher Order Differential Attack

Let $x = (x_1, \ldots, x_n) \in Z_2^n$ be an n-dimensional binary vector and f(x) be a
Boolean function. Then f(x) can be represented in algebraic normal form:

$$f(x) = a_0 \oplus a_1 x_1 \oplus \cdots \oplus a_n x_n \oplus a_{12} x_1 x_2 \oplus \cdots \oplus a_{n-1,n} x_{n-1} x_n \oplus \cdots \oplus a_{12 \cdots n} x_1 x_2 \cdots x_n.$$

The algebraic degree of f, deg(f), is defined as the degree of the highest degree
term of its algebraic normal form. If F is a vectorial Boolean function such as
$F(x) = (f_1(x), \ldots, f_m(x))$, then the degree of F is defined as

$$\deg(F) = \max_{1 \le i \le m} \deg(f_i).$$
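The algebraic degree defined above can be computed directly from a truth table. The sketch below uses the binary Moebius transform to obtain the ANF coefficients and then takes the largest monomial weight. The function names (`anf`, `degree`, `vectorial_degree`) are illustrative assumptions, and the degree of the all-zero function is reported as 0 for simplicity.

```python
def anf(truth_table):
    """Binary Moebius transform: truth table of f -> list of ANF coefficients.

    Index m of the result is the coefficient of the monomial whose variables
    are the set bits of m (bit 0 <-> x_1, bit 1 <-> x_2, ...).
    """
    coeffs = list(truth_table)
    n = len(coeffs).bit_length() - 1  # number of input variables
    for i in range(n):
        step = 1 << i
        for x in range(len(coeffs)):
            if x & step:
                coeffs[x] ^= coeffs[x ^ step]
    return coeffs


def degree(truth_table):
    """Algebraic degree of a Boolean function given by its truth table."""
    return max(
        (bin(m).count("1") for m, c in enumerate(anf(truth_table)) if c),
        default=0,
    )


def vectorial_degree(sbox, m):
    """deg(F) = max over the m output bits f_i of deg(f_i)."""
    return max(degree([(y >> j) & 1 for y in sbox]) for j in range(m))


if __name__ == "__main__":
    # f(x1, x2, x3) = x1*x2 XOR x3, a function of degree 2
    f = [((x & 1) & ((x >> 1) & 1)) ^ ((x >> 2) & 1) for x in range(8)]
    print(degree(f))
```

The transform costs about $n \cdot 2^n$ bit operations, so this approach is practical only for small components such as S-boxes, not for a full round function on wide blocks.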

Lai [9] introduced the notion of higher order derivative and proposed some 
useful facts for analyzing block ciphers. 

Definition 1 For a vectorial Boolean function $F : Z_2^n \to Z_2^m$, the derivative
of F at point $a \in Z_2^n$ is defined as

$$\Delta_a F(x) = F(x \oplus a) \oplus F(x),$$

the i-th derivative of F at $(a_1, \ldots, a_i) \in (Z_2^n)^i$ is defined by

$$\Delta^{(i)}_{a_1, \ldots, a_i} F(x) = \Delta_{a_i} \left( \Delta^{(i-1)}_{a_1, \ldots, a_{i-1}} F(x) \right),$$

where $\Delta^{(i-1)}_{a_1, \ldots, a_{i-1}} F(x)$ is the (i-1)-th derivative of F at $(a_1, \ldots, a_{i-1})$ and the 0-th
derivative of F is defined to be F itself.

Knudsen[8] extended the notion of classical differentials into higher order
differentials which can be inferred from the definition of higher order derivatives.

Definition 2 Let $F : Z_2^n \to Z_2^m$ be a vectorial Boolean function. The differential
of order i is an (i+1)-tuple $(a_1, \ldots, a_i, b) \in (Z_2^n)^i \times Z_2^m$ such that

$$\Delta^{(i)}_{a_1, \ldots, a_i} F(x) = b, \quad x \in Z_2^n.$$

Various higher order differential attacks [6,8,17,18] are based on the following
propositions shown by Lai [9].

Proposition 1

$$\deg(\Delta_a F(x)) \le \deg(F(x)) - 1.$$

Proposition 2 Let $\mathcal{L}[a_1, \ldots, a_i]$ denote the subspace generated by $\{a_1, \ldots, a_i\}$.
Then

$$\Delta^{(i)}_{a_1, \ldots, a_i} F(x) = \bigoplus_{c \in \mathcal{L}[a_1, \ldots, a_i]} F(x \oplus c).$$

Proposition 3 If $a_i$ is linearly dependent on $a_1, \ldots, a_{i-1}$, then

$$\Delta^{(i)}_{a_1, \ldots, a_i} F(x) = 0.$$
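The three propositions can be checked numerically on a small example. The sketch below implements the iterated derivative of Definition 1 and compares it with the subspace sum of Proposition 2. The toy function `F` (a 4-bit map of algebraic degree 2) and all helper names are assumptions made for illustration only.

```python
from functools import reduce
from operator import xor


def F(x: int) -> int:
    """Toy 4-bit vectorial Boolean function of algebraic degree 2 (assumption)."""
    return ((x & (x >> 1)) ^ ((x << 1) & 0xF) ^ 0x5) & 0xF


def derivative(func, vectors):
    """Iterated derivative of Definition 1 at the points in `vectors`."""
    if not vectors:
        return func
    *rest, a = vectors
    inner = derivative(func, rest)
    return lambda x: inner(x ^ a) ^ inner(x)


def span(vectors):
    """All XOR combinations of `vectors`, i.e. the subspace L[a_1, ..., a_i]."""
    elems = {0}
    for a in vectors:
        elems |= {e ^ a for e in elems}
    return elems


def subspace_sum(func, vectors, x):
    """Right-hand side of Proposition 2: XOR of func(x ^ c) over the subspace."""
    return reduce(xor, (func(x ^ c) for c in span(vectors)), 0)


if __name__ == "__main__":
    second = derivative(F, [0b0001, 0b0010])
    third = derivative(F, [0b0001, 0b0010, 0b0100])
    # Proposition 2: both computations agree for independent vectors.
    print(all(second(x) == subspace_sum(F, [1, 2], x) for x in range(16)))
    # Proposition 1 applied twice: a degree-2 map has a constant 2nd derivative
    # and a vanishing 3rd derivative.
    print(len({second(x) for x in range(16)}), {third(x) for x in range(16)})
    # Proposition 3: linearly dependent points give the zero function.
    print({derivative(F, [3, 3])(x) for x in range(16)})
```

The property used by the attacks in the next section is the second one printed here: for a function of degree d, the d-th derivative over independent vectors is a constant that does not depend on x.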



Knudsen[8] showed that attacks on 5-round ciphers of C-type using higher
order differentials are much more efficient than conventional DC. Jakobsen
and Knudsen[6] also pointed out that the higher order differential attack is effective
against the $\mathcal{KN}$ cipher because the algebraic degree of its round function F
is low. Later, Shimoyama et al.[17] improved this attack by reducing the number of
chosen plaintexts and the running time required.



4 Higher Order Differential Attack on Block Ciphers of 
Two-Block Structures 

In this section we investigate the security of two-block structures against higher
order differential attack. Let $x = (x^L, x^R)$ be the plaintext, where $x^L$ and $x^R$
denote the left and right sub-blocks of x, respectively. Similarly, let

$$E(x) = y = (y^L, y^R) = (E^L(x), E^R(x))$$

be the corresponding ciphertext and let $x_i = (x_i^L, x_i^R)$ denote the corresponding
result after i rounds of encryption. The round function is
$F : Z_2^n \to Z_2^n$ with deg(F) = d and

$$F_{k_i}(x) = F(x \oplus k_i),$$

where $k_i$ is the n-bit subkey of the i-th round. Further, we assume that d < n and that
round subkeys are always exclusive-ored just prior to the round function F.

4.1 Higher Order Differential Attack on 4 Rounds Block Ciphers 
with Two-Block Structures 

Firstly, we assume that the number of rounds is 4 in each two-block structure.
The 4 rounds block cipher of C-type structure is shown in Fig. 3.

Fig. 3. C-type structure with 4 rounds

From Fig. 3 we obtain

$$x_3^L = x^R \oplus F_{k_1}(x^L) \oplus F_{k_3}(x^L \oplus F_{k_2}(x^R \oplus F_{k_1}(x^L))),$$
$$x_3^R = x^L \oplus F_{k_2}(x^R \oplus F_{k_1}(x^L)). \tag{1}$$

It suffices to know only the value of $x_3^R$ in order to deduce the
value of $x_3$, since $x_3^L = y^L$. Let

$$T_{[k_1,k_2]}(x) = x_3^R = x^L \oplus F_{k_2}(x^R \oplus F_{k_1}(x^L)),$$

because $x_3^R$ depends upon $k_1$ and $k_2$ by (1). By Propositions 1, 2, and 3, for
any linearly independent vectors $a_1, \ldots, a_d$ such that $a_j = (0, a_j^R) \in Z_2^n \times Z_2^n$,
$1 \le j \le d$, and for some constant c,

$$\bigoplus_{v \in \mathcal{L}[a_1, \ldots, a_d]} T_{[k_1,k_2]}(x \oplus v) = c \tag{2}$$

holds for all $x \in Z_2^{2n}$ and $k_1, k_2 \in K$. Thus we obtain

$$c = \bigoplus_{v \in \mathcal{L}[a_1, \ldots, a_d]} T_{[0,0]}(v). \tag{3}$$

Moreover, in the C-type structure $x_3^R = F_{k_4}(y^L) \oplus y^R$ holds, hence

$$T_{[k_1,k_2]}(x) = F_{k_4}(E^L(x)) \oplus E^R(x). \tag{4}$$

Consequently, from (2), (3), and (4), we obtain

$$\bigoplus_{v \in \mathcal{L}[a_1, \ldots, a_d]} F_{k_4}(E^L(v)) \oplus E^R(v) = \bigoplus_{v \in \mathcal{L}[a_1, \ldots, a_d]} T_{[0,0]}(v). \tag{5}$$

By using (5), the process of recovering the last round subkey $k_4$ is as
follows:

1. On $Z_2^{2n}$, choose linearly independent vectors $a_1, \ldots, a_d$ such that
   $a_j = (0, a_j^R) \in Z_2^n \times Z_2^n$ for all $1 \le j \le d$.

2. Assume the value of $k_4$.

3. For each $v \in \mathcal{L}[a_1, \ldots, a_d]$, do the following:

   a) Obtain $E^L(v)$ and $E^R(v)$.

   b) Compute $T_{[0,0]}(v)$.

   c) Compute $F_{k_4}(E^L(v))$.

4. If (5) holds, then $k_4$ is the right key. Otherwise, go to step 2.

From the above process, the required number of
chosen plaintexts is $2^d$ and the maximum running time is about $4 \cdot 2^d + 2^{d+n}$
computations of the round function F for finding the last round key.
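The key recovery steps above can be sketched on a toy cipher. Everything concrete in this example is an assumption chosen for illustration: the block half-size n = 8, the degree-2 round function `F`, the convention that each round maps (L, R) to (R xor F(L xor k), L) with no swap after the last round, and all function names. Because a single set of $2^d$ plaintexts may leave several key candidates for such a small, structured `F`, the sketch repeats the test from several base plaintexts and intersects the candidate sets; this is a practical refinement, not a step stated in the text.

```python
from functools import reduce
from operator import xor

N = 8                      # bits per half block (toy size, assumption)
MASK = (1 << N) - 1


def rotl(x, r):
    return ((x << r) | (x >> (N - r))) & MASK


def F(x):
    """Toy round function of algebraic degree d = 2 (not from the paper)."""
    return (x & rotl(x, 1)) ^ rotl(x, 3) ^ 0x5A


def encrypt(xl, xr, keys):
    """4-round C-type (Feistel) toy cipher; last round without swap."""
    k1, k2, k3, k4 = keys
    for k in (k1, k2, k3):
        xl, xr = xr ^ F(xl ^ k), xl
    return xl, xr ^ F(xl ^ k4)


def span(vectors):
    elems = {0}
    for a in vectors:
        elems |= {e ^ a for e in elems}
    return sorted(elems)


def xor_all(values):
    return reduce(xor, values, 0)


def t00(xl, xr):
    """T_[0,0](x) = x^L xor F(x^R xor F(x^L)), computable without any key."""
    return xl ^ F(xr ^ F(xl))


def recover_k4(oracle, right_diffs, bases):
    """Return all guesses for k4 that satisfy equation (5) for every base point."""
    sub = span(right_diffs)                      # differences (0, a) in the right half
    c = xor_all(t00(0, a) for a in sub)          # constant of equations (2)-(3)
    candidates = set(range(1 << N))
    for bl, br in bases:
        cts = [oracle(bl, br ^ a) for a in sub]  # 2^d chosen plaintexts
        candidates = {
            k for k in candidates
            if xor_all(F(yl ^ k) ^ yr for yl, yr in cts) == c
        }
    return candidates


if __name__ == "__main__":
    secret = (0x13, 0xA7, 0x5C, 0xE1)
    oracle = lambda l, r: encrypt(l, r, secret)
    bases = [(0x00, 0x00), (0x3C, 0x91), (0xA5, 0x0F), (0x7E, 0xD2)]
    found = recover_k4(oracle, [0x01, 0x02], bases)
    print(sorted(hex(k) for k in found))
```

The correct subkey always survives the filter because the d-th derivative of a degree-d function is a key-independent constant; how many wrong guesses also survive depends on the round function and on the number of base plaintexts used.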

On the other hand, the 4 rounds R-type and L-type structures are shown
in Fig. 4 and 5, respectively. Remember that the round function F must be
invertible for decryption in the R-type and L-type structures. Assume that in the
last round of the R-type and L-type structures, one more round subkey is exclusive-ored
just after the round function F. This assumption is reasonable since without such
a step, the round function F in the last round has no cryptographic meaning.




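Fig. 4. R-type structure with 4 rounds

On the 4 rounds R-type structure, we obtain

$$x_3^L = x^L \oplus F_{k_1}(x^R) \oplus F_{k_2}(x^L) \oplus F_{k_3}(x^L \oplus F_{k_1}(x^R)),$$
$$x_3^R = x^L \oplus F_{k_1}(x^R) \oplus F_{k_2}(x^L) \tag{6}$$

by Fig. 4. By (6),

$$T_{[k_1,k_2,k_4]}(x) = \hat{x}_3^R = x^L \oplus F_{k_1}(x^R) \oplus F_{k_2}(x^L) \oplus k_4,$$

where $\hat{x}_3^R = x_3^R \oplus k_4$. Writing $k_5$ for the additional last-round subkey and
$F_{k_5}^{-1}(z) = F^{-1}(z \oplus k_5)$, we have

$$\bigoplus_{v \in \mathcal{L}[a_1, \ldots, a_d]} F_{k_5}^{-1}(E^L(v) \oplus E^R(v)) = \bigoplus_{v \in \mathcal{L}[a_1, \ldots, a_d]} T_{[0,0,0]}(v) \tag{7}$$

since $\hat{x}_3^R = F_{k_5}^{-1}(y^L \oplus y^R)$. Therefore the required number of chosen
plaintexts and the running time of the C-type and R-type structures are the same.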




Fig. 5. L-type structure with 4 rounds

Similarly, by Fig. 5 we obtain

$$x_3^L = F_{k_2}(x^R \oplus F_{k_1}(x^L)) \oplus F_{k_3}(F_{k_1}(x^L) \oplus F_{k_2}(x^R \oplus F_{k_1}(x^L))),$$
$$x_3^R = F_{k_3}(F_{k_1}(x^L) \oplus F_{k_2}(x^R \oplus F_{k_1}(x^L))). \tag{8}$$

We must know the value of $x_3^L$ to recover $x_3$, since $x_3^R = y^L \oplus y^R$. At this point,
a quick glance at (8) may suggest that the algebraic degree of the function
representing $x_3^L$ is at least $d^2$. However, in the
L-type structure, if we recover $x_3^L$ then we also find the value of $x_2^R$,
since $x_2^R = x_3^L \oplus x_3^R$. So the attacker can use the formula

$$x_2^R = F_{k_2}(x^R \oplus F_{k_1}(x^L)) \tag{9}$$

instead of (8). Now let $\hat{x}_3^L = x_3^L \oplus k_4$, $\hat{x}_2^R = \hat{x}_3^L \oplus x_3^R$, and

$$T_{[k_1,k_2,k_4]}(x) = \hat{x}_2^R = F_{k_2}(x^R \oplus F_{k_1}(x^L)) \oplus k_4.$$

Then by choosing linearly independent vectors $a_1, \ldots, a_d$ such that $a_j^L = 0$,
$1 \le j \le d$, we obtain a formula similar to (5) of C-type and (7) of
R-type. Writing $k_5$ for the additional last-round subkey and
$F_{k_5}^{-1}(z) = F^{-1}(z \oplus k_5)$,

$$\bigoplus_{v \in \mathcal{L}[a_1, \ldots, a_d]} F_{k_5}^{-1}(E^R(v)) \oplus E^L(v) \oplus E^R(v) = \bigoplus_{v \in \mathcal{L}[a_1, \ldots, a_d]} T_{[0,0,0]}(v),$$

since

$$\hat{x}_2^R = F_{k_5}^{-1}(y^R) \oplus y^L \oplus y^R.$$

From this, the required number of chosen plaintexts and the running
time for the higher order differential attack on the 4 rounds L-type structure are the
same as those for the C-type and R-type structures with 4 rounds.

Theorem 1 Let the algebraic degree of the round function F be deg(F) = d. Then
the required number of chosen plaintexts and the maximum number of computations of the round
function F for the higher order differential attack on 4 rounds C-type, R-type,
and L-type structures in order to recover the last round keys are all $2^d$ and
$4 \cdot 2^d + 2^{d+n}$, respectively.

4.2 Higher Order Differential Attack on 5 or More Rounds Block
Ciphers with Two-Block Structures

Now we consider the security of 5 or more rounds of the three types against
higher order differential attack. Note that we should pay attention to the
equation of $x_4^R$ in the 5 rounds C-type and R-type structures in order to analyze the
security against higher order differential attack. But in the case of the 5 rounds
L-type structure, we must observe the equation of $x_3^R$. The equations concerned
are as follows:

$$\text{C-type}: \quad x_4^R = x^R \oplus F_{k_1}(x^L) \oplus F_{k_3}(x^L \oplus F_{k_2}(x^R \oplus F_{k_1}(x^L))), \tag{10}$$

$$\text{R-type}: \quad x_4^R = x^L \oplus F_{k_1}(x^R) \oplus F_{k_2}(x^L) \oplus F_{k_3}(x^L \oplus F_{k_1}(x^R)), \tag{11}$$

$$\text{L-type}: \quad x_3^R = F_{k_3}(F_{k_1}(x^L) \oplus F_{k_2}(x^R \oplus F_{k_1}(x^L))). \tag{12}$$

By (10) and (12), it is easy to see that the algebraic degrees of the functions
representing $x_4^R$ of C-type and $x_3^R$ of L-type are at least $d^2$. On the contrary,
from (11), in the 5 rounds R-type structure the algebraic
degree of the function representing $x_4^R$ can be decreased down to d if we choose
linearly independent vectors $a_1, \ldots, a_d$ such that $a_j^R = 0$ for all $1 \le j \le d$.
We investigate the relationship between the number of rounds of the three types
and the security against higher order differential attack to recover the last round
keys. In the cases of the C-type and L-type structures, the minimal algebraic degree
for the attack is d when the number of rounds is 4, and it is $d^2, d^3, \ldots$
for 5 rounds, 6 rounds, and so on, respectively. However, in the case of the R-type
structure, the minimal algebraic degree for the attack after 4 rounds increases
by a factor of d per two rounds. We summarize these results in the following theorem.

Theorem 2 Assume that r denotes the number of rounds with $r \ge 4$ and the
algebraic degree of the round function F is d. Then the required plaintexts and
running time for the higher order differential attacks on r rounds C-type, R-type,
and L-type structures in order to recover the last round keys are as in Table 1. In
Table 1, $\lfloor x \rfloor$ denotes the greatest integer not exceeding x, and the running time
represents the number of computations of the round function F.



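Table 1. Higher order differential attack of three types

Types    # of rounds   # of chosen plaintexts              Running time
C-type   $r \ge 4$     $2^{d^{r-3}}$                       $r \cdot 2^{d^{r-3}} + 2^{d^{r-3}+n}$
R-type   $r \ge 4$     $2^{d^{\lfloor r/2 \rfloor - 1}}$   $r \cdot 2^{d^{\lfloor r/2 \rfloor - 1}} + 2^{d^{\lfloor r/2 \rfloor - 1}+n}$
L-type   $r \ge 4$     $2^{d^{r-3}}$                       $r \cdot 2^{d^{r-3}} + 2^{d^{r-3}+n}$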
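To make the growth rates in Table 1 concrete, the following sketch evaluates the number of chosen plaintexts and the running time for given r, d, and n. The function names are hypothetical. The C-type and L-type exponent $d^{r-3}$ follows Theorem 1 and the degrees d, $d^2$, $d^3$ stated for 4, 5, 6 rounds; the R-type exponent $d^{\lfloor r/2 \rfloor - 1}$ is an assumption derived from the statement that the minimal degree grows by a factor of d every two rounds after round 4.

```python
def attack_degree(structure: str, r: int, d: int) -> int:
    """Minimal algebraic degree usable by the attack on an r-round structure."""
    if r < 4:
        raise ValueError("the analysis covers r >= 4 only")
    if structure in ("C", "L"):
        return d ** (r - 3)
    if structure == "R":
        # Assumed form: d at r = 4, 5; d^2 at r = 6, 7; and so on.
        return d ** (r // 2 - 1)
    raise ValueError(f"unknown structure {structure!r}")


def complexity(structure: str, r: int, d: int, n: int):
    """Return (# chosen plaintexts, # round-function computations)."""
    e = attack_degree(structure, r, d)
    plaintexts = 2 ** e
    running_time = r * 2 ** e + 2 ** (e + n)
    return plaintexts, running_time


if __name__ == "__main__":
    for r in (4, 5, 6, 7):
        row = {s: attack_degree(s, r, d=3) for s in ("C", "R", "L")}
        print(r, row)
```

For r = 4 all three structures give the same values, in line with Theorem 1; from r = 5 onwards the R-type exponent lags behind the other two.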



4.3 Probabilistic Higher Order Differential Attack on 4 or More
Rounds Block Ciphers with Two-Block Structures

Recently, Iwata and Kurosawa [4] showed that a Feistel type block cipher can be
broken when the round function is approximated by a low degree vectorial Boolean
function. They called this attack a probabilistic higher order differential attack
because it generalizes the higher order differential attack to a
probabilistic one. In this subsection, we study the security against probabilistic higher
order differential attack for block ciphers with two-block structures. First, we
introduce the notion of $(d, \mu)$-expressibility from [5,4].

Definition 3 A vectorial Boolean function F is $(d, \mu)$-expressible if there exists
a vectorial Boolean function F' such that $\deg(F') \le d$ and

$$\mathrm{Prob}(F(x) = F'(x)) \ge \mu.$$

Now we assume that the round functions of the C-type, R-type, and L-type
structures are all $(d, \mu)$-expressible. Then by a process similar to the above two
subsections and the proof of Theorem 4.1 of [4], we obtain the following
theorem.
Theorem 3 Assume that r denotes the number of rounds with $r \ge 4$ and the
round function F is $(d, \mu)$-expressible. Then the required plaintexts and running
time for the probabilistic higher order differential attacks on r rounds C-type,
R-type, and L-type structures in order to recover the last round subkeys are as in
Table 2. In Table 2, N denotes the number of times Algorithm 1 of [4] is run,
and the success probability is given by

Y, (")!>•<! -»>)""■( E 

l<i<N ^ ' j 

where p = 1 — 2‘^+^n(l — /i). 



Table 2. Probabilistic higher order differential attack of three types 



Types 


# of rounds


# of chosen plaintexts


Running time 


C-type 


r > 4 




N[r ■ 2'^"”" -b 2'*’'”"+") 


R-type 


r > 4 






L-type 


r > 4 




N{r ■ 2'^"”" -b 2'*’'""+") 



A merit of the R-type structure is that it allows parallel computation of the
round functions without losing provable security against DC and LC. Thus from
the viewpoint of computational efficiency, the R-type structure is better than the C-type
and L-type structures. However, by Theorems 2 and 3, the R-type is weaker than
the C-type and L-type from the viewpoint of security against
(probabilistic) higher order differential attack. This shows once again that
it is hard to reconcile security with computational efficiency.

On the other hand, it is generally believed that designing a cryptographically
good random function is easier than designing a good random permutation.
Since only the C-type structure admits a round function that is not a bijection,
we conclude from this and from our results that the C-type is the best
choice from these two points of view.

5 Conclusion 

In this work we investigated the security against higher order differential attack
of two-block structures of three different types which are provably secure against
DC and LC. We used the relationship between the algebraic degree of the
round function and the higher order differential attack. In the case of 4 rounds
encryption, we showed that the C-type, R-type, and L-type structures all have the
same security against higher order differential attack and probabilistic higher
order differential attack. Further, we proved that in the case of 5 or more rounds
encryption, the C-type and L-type are stronger than the R-type from the viewpoint of
security against higher order differential attack and probabilistic higher order
differential attack.
Abstract. The encryption algorithm KASUMI is based on MISTY,
proposed by Matsui, which is provably secure against linear cryptanalysis
and differential cryptanalysis. We attacked KASUMI without FL functions by
using higher order differential attack. The necessary order of the higher
order differential attack depends on the degree of the F function and is
determined by the chosen plaintexts. We found effective chosen plaintexts
which enable an attack on 4 round KASUMI without FL functions. As
a result, we can attack it using 2nd order differentials. This attack
needs about 1,416 chosen plaintexts.



1 Introduction 

The security architecture of the 3GPP system is based on KASUMI.
KASUMI is a block cipher that produces a 64 [bit] output from a 64 [bit] input
under the control of a 128 [bit] key. It has a DES-like structure with an 8 round F
function named FO. KASUMI is based on MISTY1, proposed by Matsui, which
is provably secure against linear cryptanalysis and differential cryptanalysis.
Because of this, the main part of the security of KASUMI is expected to be guaranteed by the FO
function. Thus we estimate the strength of the FO function against higher order
differential attack in this paper.

Higher Order Differential Attack is a chosen plaintext attack which uses the 
fact that the value of higher order differential of the output does not depend 
on the key. The sufficient order for the attack affects the necessary number of 
chosen plaintexts and computational cost. So the attacker needs to search for 
the effective choice of plaintexts. 

In this paper, we show that 4 round KASUMI without FL functions can be 
attackable using effective 2nd order differentials. This attack needs 1,416 chosen 
plaintexts and computational cost. 

D. Won (Ed.): ICISC 2000, LNCS 2015, pp. 14-21, 2001.

(i) KASUMI without FL (ii) Equivalent FO (iii) Equivalent FI

Fig. 1. KASUMI without FL, Equivalent FO and Equivalent FI



2 Modified KASUMI 



KASUMI is based on MISTY1. The differences between them are as follows.



— The design of S-boxes, S7 and S9. 

— The design of FL function. 

— The number of S-box (S7) in a FI function. 

Since the main part of the security of KASUMI will be guaranteed by the FO function,
we omit the FL function in this paper. To simplify the Attack equation, we
deduce the equivalent FO function and FI function, which are shown in Figure 1. The FO
function consists of 3 rounds of FI functions, and the FI function consists of 2
kinds of S-boxes called S7 and S9. S7 is a 7 [bit] table whose algebraic degree is
3. S9 is a 9 [bit] table whose degree is 2. We denote the FO function in the i-th round
as FOi, the j-th FI function in FOi as FIij, and the equivalent sub-key for the k-th
S-box as kijk.

Fig. 2. Last round of r round KASUMI without FL



3 Higher Order Differential Attack 

3.1 Higher Order Differential 

Let $F(X; K)$ be a function $GF(2)^n \times GF(2)^s \to GF(2)^m$.

$Y = F(X; K), \quad (X \in GF(2)^n,\ Y \in GF(2)^m,\ K \in GF(2)^s)$   (1)

Let $\{A_1, A_2, \ldots, A_i\}$ be a set of linearly independent vectors in $GF(2)^n$ and
$V^{(i)}$ be the subspace spanned by the set. We define $\Delta^{(i)}_{V^{(i)}} F(X; K)$ as the i-th order
differential of $F(X; K)$ with respect to $X$ as follows.

$\Delta^{(i)}_{V^{(i)}} F(X; K) = \bigoplus_{A \in V^{(i)}} F(X + A; K)$   (2)

If $\deg_X F(X; K) = N$, we have the following property.

$\deg_X\{F(X; K)\} = N \Rightarrow \Delta^{(N+1)}_{V^{(N+1)}} F(X; K) = 0$   (3)
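
The vanishing of the differential of order N + 1 can be checked on a small example. The following Python sketch is only an illustration under stated assumptions: `toy_f` is a hypothetical 4-bit keyed function of algebraic degree 2 (it is not part of KASUMI), and bit vectors are stored as integers. It also shows that, for this toy function, the differential of order N is a constant that does not depend on the key.

```python
def span(basis):
    """Return every XOR-combination of the basis vectors (bit vectors stored as ints)."""
    elements = [0]
    for b in basis:
        elements += [e ^ b for e in elements]
    return elements


def higher_order_differential(f, x, basis):
    """XOR of f(x ^ a) over all a in the subspace spanned by `basis`."""
    acc = 0
    for a in span(basis):
        acc ^= f(x ^ a)
    return acc


def toy_f(x, key):
    """Hypothetical 4-bit keyed function of algebraic degree 2 in x."""
    v = x ^ key
    b = [(v >> i) & 1 for i in range(4)]
    return (b[0] & b[1]) ^ (b[2] & b[3]) ^ b[1] ^ b[3]


# Order 3 = degree + 1: the differential vanishes for every input and key.
third_order = {
    higher_order_differential(lambda x: toy_f(x, key), x0, [0b0001, 0b0010, 0b0100])
    for key in range(16)
    for x0 in range(16)
}

# Order 2 = degree: here the differential is a constant independent of the key.
second_order = {
    higher_order_differential(lambda x: toy_f(x, key), x0, [0b0001, 0b0010])
    for key in range(16)
    for x0 in range(16)
}
```

Both sets contain a single value, which is the key-independence that the attack exploits.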



3.2 Attacking Equation 

Figure 2 shows the last round of i round KASUMI. $H^{(i-2)}(P)$, which is the
output from $FO_{i-2}$, can be calculated from the ciphertext $C_L(P), C_R(P) \in GF(2)^{32}$
and the sub-key $K_i$ as follows.

$H^{(i-2)}(P) = FO(C_L(P); K_i) + C_R(P)$   (4)

If $\deg_P H^{(i-2)}(P) = N$, the following equation holds.

$\Delta^{(N+1)}_{V^{(N+1)}} H^{(i-2)}(P) = 0$   (5)

where $H^{(i-2)}(\cdot)$ denotes the function that maps the plaintext and $K^{(1 \sim (i-2))}$ to
the output of $FO_{i-2}$, and $K^{(1 \sim (i-2))}$ denotes the set of keys for the previous $(i-2)$ rounds.

From equations (4) and (5), we can derive the following equation.

$\bigoplus_{A \in V^{(N+1)}} \{FO(C_L(P + A); K_i) + C_R(P + A)\} = 0$   (6)

If the right-hand side of this equation can be determined by some analytical method,
we can use equation (6) as the Attack equation for $K_i$.

4 Analysis of Modified KASUMI

4.1 Effective chosen plaintext 

The order of the Higher Order Differential Attack affects the necessary number of
chosen plaintexts and the computational cost. The attacker needs to search for the
minimum one. The order of the output is determined by the choice of variable bits.

Let $P = (P_L, P_R)$, $P_L, P_R \in GF(2)^{32}$, be the plaintext. Then $H^{(2)}(P)$ can be
calculated as follows.

$FO_2(FO_1(P_L; K_1) \oplus P_R; K_2) \oplus P_L = H^{(2)}(P)$   (7)

$P_L$ passes through 2 FO functions. On the other hand, $P_R$ passes through only 1. So we fix $P_L$ in
the following to slow the increase of the degree of the output.

4.2 Formal analysis of degree 

In this section, we show the formal analysis of the degree. Figure 3 shows the increase
of the degree in FI21 and FO2. The symbol <a|b> denotes that the degree of the left
block is a and that of the right block is b. This figure shows that the degree of the output
from FI21 and FI22 is <9|4> and the degree of the output from FI23 is <54|36>.

In the same way, the degree of the output from FO3 will be <324|216>. As mentioned
before, we fix half of the 64 [bit] plaintext, so we cannot calculate a differential of
order larger than 32. Thus we cannot calculate the higher order differential
in the simple way.

4.3 Computational analysis of degree 

As mentioned before, the degree of the output depends on the choice of variable bits.
We made a brute force search for the effective choice of variable bits in $P_R$ for the 2nd
~ 7th order differentials by computer simulation.

Due to the limited space, we show a part of the result for $H^{(2)}$ in Table 1. The
symbols $A^{aL}$ and $A^{aR}$ denote the left a [bit] of sub-block A and the right a [bit] of
sub-block A. The symbol $A^{aL}_{(m)}$ denotes the m-th bit of $A^{aL}$ and $A^{aR}_{(m)}$ denotes the m-th bit of
$A^{aR}$ (m = 0 ~ (a - 1)).

We have not found the effective choice for yet. This is our future work. 

5 Attack of KASUMI 

From the computer simulations, we conclude that 4 round KASUMI without FL
functions can be attacked using a 2nd order differential. For the 2nd order
differential, the following holds.

$\Delta^{(2)} H^{16L} = 0$   (8)

From equation (6), we can derive the following Attack equation.

$\bigoplus_{P \in V^{(2)}} \{FO(C_L(P); K_4) \oplus C_R(P)\} = 0$   (9)

where

$K_4 = (k_{411}, k_{412}, k_{413}, k_{421}, k_{422}, k_{423})$

$k_{411}, k_{413}, k_{421}, k_{423} \in GF(2)^9, \quad k_{412}, k_{422} \in GF(2)^7$

n | Choice of variable bits | Output sub-blocks or bits whose n-th order differential is 0
2 | $(P^{16L}_{(1)}, P^{16R}_{(1)})$, $(P^{16L}_{(1)}, P^{16R}_{(2)})$, ..., $(P^{16L}_{(16)}, P^{16R}_{(16)})$ | $H^{16L}$
3 | $(P^{16R}_{(1,3)}, P^{16L}_{(1)})$, $(P^{16R}_{(1,3)}, P^{16L}_{(2)})$, ..., $(P^{16R}_{(1,3)}, P^{16L}_{(16)})$ | $H^{16L}$, $H^{16R}_{(1,5,7,8,9)}$
4 | $(P^{16R}_{(1,2)}, P^{16L}_{(1)}, P^{16R}_{(3)})$, $(P^{16R}_{(1,2)}, P^{16L}_{(1)}, P^{16R}_{(4)})$, ..., $(P^{16R}_{(1,2)}, P^{16L}_{(16)}, P^{16R}_{(16)})$ | $H^{16L}$, $H^{16R}_{(1 \sim 9)}$
5 | $(P^{16R}_{(1,2,3)}, P^{16L}_{(1)}, P^{16R}_{(4)})$, ..., $(P^{16R}_{(1,2,3)}, P^{16L}_{(16)}, P^{16R}_{(16)})$ | $H^{16L}$, $H^{16R}_{(1 \sim 9)}$
6 | $(P^{16R}_{(1,3,4)}, P^{16R}_{(8)}, P^{16R}_{(9)}, P^{16R}_{(10)})$, $(P^{16R}_{(1,3,4)}, P^{16R}_{(8)}, P^{16R}_{(9)}, P^{16R}_{(11)})$, ..., $(P^{16R}_{(1,3,4)}, P^{16R}_{(14)}, P^{16R}_{(15)}, P^{16R}_{(16)})$ | $H^{(2)}$
7 | $(P^{16R}_{(1,2,3,4)}, P^{16R}_{(8)}, P^{16R}_{(9)}, P^{16R}_{(10)})$, $(P^{16R}_{(1,2,3,4)}, P^{16R}_{(8)}, P^{16R}_{(9)}, P^{16R}_{(11)})$, ..., $(P^{16R}_{(1,2,3,4)}, P^{16R}_{(14)}, P^{16R}_{(15)}, P^{16R}_{(16)})$ |

Table 1. Result of computer simulation for $H^{(2)}$

5.1 Preparation 

There are 50 [bit] unknowns in the Attack equation (9). We cannot determine
them by brute force search. In this paper, we adopt the algebraic method to
solve the equation. The attacker solves an Attack equation using known ciphertexts
and plaintexts. Let Z be the output from $FO_4$.

$Z = (Z^{L7L}, Z^{L9R}, Z^{R7L}, Z^{R9R})$   (10)

We can calculate $Z^{L9R}$, which has the smallest degree among Z, as follows.

$Z^{L9R} = S9(k_{413} \oplus C_3^L \oplus S9(k_{411} \oplus C_1^L|C_2^L))$
$\quad \oplus\ S9(k_{411} \oplus C_1^L|C_2^L) \oplus S7(k_{412} \oplus C_3^L)$
$\quad \oplus\ S9(k_{423} \oplus C_3^R \oplus S9(k_{421} \oplus C_1^R|C_2^R))$
$\quad \oplus\ S9(k_{421} \oplus C_1^R|C_2^R) \oplus S7(k_{422} \oplus C_3^R)$
$\quad \oplus\ C_1^R|C_2^R \oplus ke_1|ke_2$   (11)

(i) FO2 (ii) FI21

Fig. 3. Increase of degree in FO2

We regard all the variable terms with respect to $K_4$ as independent variables.
The highest term of $Z^{L9R}$ is $S9(k_{413} \oplus C_3^L \oplus S9(k_{411} \oplus C_1^L|C_2^L))$. So the degree
of $Z^{L9R}$ with respect to C is 4. Since $\Delta^{(2)} H^{16L} = 0$ holds, we have the following.

$\Delta^{(2)}\{Z^{L9R} \oplus C_R(P)\} = 0$   (12)

Note that the appropriate bits of $C_R(P)$ are selected for this equation. By solving
this equation, we can determine $k_{411}, k_{412}, k_{413}, k_{421}, k_{422}, k_{423}$. From equation
(9), we have the following.

$\bigoplus_{P \in V^{(2)}} Z^{L9R}(C_L(P); K_4) \oplus C_R(P) = 0$   (13)

We can transform this equation as follows.

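$\bigoplus_{P \in V^{(2)}} Z^{L9R}(C_L(P); K_4) = \bigoplus_{P \in V^{(2)}} C_R(P)$

$\bigoplus_{P \in V^{(2)} \setminus \{0\}} Z^{L9R}(C_L(P) \oplus C_L(Q); K_4) = \bigoplus_{P \in V^{(2)}} C_R(P)$

$\bigoplus_{P \in V^{(2)} \setminus \{0\}} \Delta^{(1)} Z^{L9R}(C_L(P); K_4) = \bigoplus_{P \in V^{(2)}} C_R(P)$

where $V^{(2)} \setminus \{0\}$ denotes $V^{(2)}$ except the all-zero element.

By this transformation, the degree of the left-hand side of the equation with respect
to $K_4$ becomes 3. The attacker can calculate the constant value of the right-hand side of
the equation. As a result, we have the following Attack equation.

$\bigoplus_{P \in V^{(2)} \setminus \{0\}} \Delta^{(1)} Z^{L9R}(C_L(P); K_4) = \mathrm{const}$   (14)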



5.2 Estimation of necessary number of chosen plaintexts 

The term $S9(k_{411} \oplus C_1^L|C_2^L)$ has a 9 [bit] unknown sub-key and its degree is 2. So
there are ${}_9C_2 + {}_9C_1 = 45$ unknowns. The term $S7(k_{412} \oplus C_3^L)$ has a 7 [bit] unknown
sub-key and its degree is 3. So there are ${}_7C_3 + {}_7C_2 + {}_7C_1 = 63$ unknowns. The
term $S9(k_{413} \oplus C_3^L \oplus S9(k_{411} \oplus C_1^L|C_2^L))$ has ${}_9C_2 + {}_9C_1 + 9 = 54$ unknowns, and its
degree is 2. So there are ${}_{54}C_2 + {}_{54}C_1 = 1,485$ unknowns. In the same way we
calculate the number of unknowns in the terms $S9(k_{421} \oplus C_1^R|C_2^R)$, $S7(k_{422} \oplus C_3^R)$, and
$S9(k_{423} \oplus C_3^R \oplus S9(k_{421} \oplus C_1^R|C_2^R))$. As a result, we found that there are 3,186
unknowns in the Attack equation.

We can deduce 9 linear equations from one Attack equation because the Attack
equation is a vector equation on $GF(2)^9$. To solve the equation, we need
3,186/9 = 354 different 2nd order differentials. Thus we need $2^2 \times 354 = 1,416$
chosen plaintexts to attack 4 round KASUMI without FL functions.
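
The counting above can be reproduced with a few lines of Python. This is only a check of the arithmetic stated in the text; the variable names are illustrative and not taken from the paper.

```python
from math import comb

# Unknown monomials contributed by one FI function of the equivalent FO_4.
s9_inner = comb(9, 2) + comb(9, 1)                    # degree-2 term in a 9-bit sub-key
s7_term = comb(7, 3) + comb(7, 2) + comb(7, 1)        # degree-3 term in a 7-bit sub-key
inner_vars = comb(9, 2) + comb(9, 1) + 9              # variables entering the outer S9
s9_outer = comb(inner_vars, 2) + comb(inner_vars, 1)  # degree-2 term in those variables

unknowns_per_fi = s9_inner + s7_term + s9_outer
total_unknowns = 2 * unknowns_per_fi                  # two FI functions are involved

equations_per_differential = 9                        # vector equation on GF(2)^9
differentials_needed = -(-total_unknowns // equations_per_differential)  # ceiling division
chosen_plaintexts = 2**2 * differentials_needed       # 4 plaintexts per 2nd order differential

print(s9_inner, s7_term, inner_vars, s9_outer)
print(total_unknowns, differentials_needed, chosen_plaintexts)
```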



6 Conclusion 

We showed that the 4 round KASUMI cipher without FL functions is attackable
by Higher Order Differential Attack using 2nd order differentials. We adopted the
algebraic method to solve the Attack equation. By this attack, we will be able to
determine 6 sub-keys in the 4th round using 1,416 chosen plaintexts. And we estimate
that necessary computational cost is times FO function operations. The 

algebraic method will be 2^® ®® times faster than brute force search which needs 
2®^ times FO function operations. 
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Abstract. MISTYl is a block cipher whose design relies on an asser- 
tion of provable security against linear and differential cryptanalysis. 
Yet, a simplified and round reduced version of MISTY1 that does not alter
the security provability can be attacked with higher order differential
cryptanalysis. We managed to explain this attack by deriving the attacking
property from the choice of an atomic component of the algorithm,
namely one of the two MISTY1 S-boxes. This allowed us to classify the
good and the bad S-boxes built with the same principles and to show that
none of the S-boxes with optimal linear and differential properties has an
optimal behaviour with respect to higher order differential cryptanalysis.



1 Introduction 

The encryption algorithm MISTY1 is a block cipher designed by Matsui in 1996
(see [5]) with the achievements of provable security and reasonable speed in both
software and hardware. It is built on a Feistel scheme, iterating over 8 rounds a
function named FO. The original design also uses in each round a small and
fast linear function named FL. As the design with FO is proved secure against
differential and linear cryptanalysis, the purpose of adding FL functions is to
prevent attacks other than differential and linear, but with no proved security.
Section 2 gives the scheme of MISTY1.

Restricting themselves to the “provable secure part” of MISTY1, [9] and [8]
showed that a reduced version of MISTY1 with only 5 rounds and with
no FL functions is attackable by higher order differential cryptanalysis. The
authors gained hope in such an attack by noticing that the algebraic degree
of the overall cipher does not grow very fast in the first 3 rounds. Indeed, the
scheme uses two S-boxes which themselves have rather low algebraic degree, well
suited for hardware implementation. The attack will be explained in Section 3.

Increasing the algebraic degree of S-boxes would prevent such an attack. Yet,
[9] and [8] do not explain the origin of the attacking property. For instance, it
could not be said that increasing the algebraic degree of the S-boxes was the
only way to change the degree of the overall cipher and prevent higher order
differential attacks. We managed to derive the higher order differential property
used by the attackers from a property on $S_7$, an S-box permutation on 7 bits,
which will provide a link between the chosen exponent and the attack. Moreover,
we were able to classify the exponents that would allow a higher order
differential attack on the weaker version of MISTY1, and to compare them to the
good exponents for provable security against differential and linear cryptanalysis.
We explain how in Section 4.

D. Won (Ed.): ICISC 2000, LNCS 2015, pp. 22-36, 2001.

2 Presentation of MISTY1

This Section recalls the construction of the block cipher MISTY1 as defined in
[5].

MISTY1 is designed around small block schemes used recursively. There are
three levels of these schemes. The small one, with 16-bit input, is called FI
(figure 2), and is built with three calls to two permutation S-boxes, $S_7$ and $S_9$,
with respective inputs on 7 and 9 bits. The intermediate scheme, with 32-bit
input, is called FO (figure 2) and makes three calls to FI. The main scheme
MISTY1 is a block cipher with 64-bit input, whose number of rounds should
be a multiple of 4. Each pair of rounds (figure 1) is made of two calls to FO
and two calls to a small linear function FL (figure 2). The proof of security of
such a scheme against differential and linear cryptanalysis uses generalisations
of techniques first introduced in [7] (resistance against differential attacks) and
in [6] (resistance against linear attacks).

The key schedule produces three series of keys (i referring to the round
number): $KL_{ij}$ (j = 1, 2) used in $FL_i$, $KO_{ij}$ (j = 1...4) used in $FO_i$, and
$KI_{ijk}$ (j = 1...4, k = 1, 2) used in $FI_{ij}$.

There is an equivalent key schedule that makes things a little bit easier to
see on MISTY1's schemes (see figure 3). The $KO_i$ and $KI_{ij}$ series can be joined in a
single series $K_{ijk}$ (j = 1...3, k = 1...3) used in $FI_{ij}$, and a single key word
XORed with the algorithm's output word. A XOR with $K_{ijk}$ is made just before
entering an S-box, i.e. before entering the only non-linear parts of the cipher.

The two S-boxes are chosen in order to have good differential and linear
properties^1, to be fast to compute in hardware, and to have the highest reasonable
algebraic degree. Here are some basic definitions:

Definition 1 ($GF(2^m)$). $GF(2^m)$ denotes the unique finite field with $2^m$ elements.
We choose on $GF(2^m)$ a polynomial basis $(\gamma^i)_{0 \le i \le m-1}$, thanks to which
we can assimilate $GF(2^m)$ to $\{0,1\}^m$. For any function $GF(2^m) \to GF(2^n)$,
$x \mapsto y(x)$, we write $\deg_x y$ its algebraic degree in x ($0 \le \deg_x y \le m$). We will
use the fact that $\deg_x x^e = |e|$ for an exponent e that is coprime with the order
of the multiplicative group, $|e|$ being the Hamming weight of e^2.

^1 This property is to have a minimal average differential and linear probability. See [4].

^2 We obviously have $\deg_x x^e \le |e|$. For a complete proof, see for example [2].

Both S-boxes are chosen among permutations in $GF(2^m)$ having the form
$x \mapsto L(x^e)$ ^3 with an exponent e and a linear permutation L. Choosing the degree in order to keep
a reasonably short hardware computation time, the exponent is of Hamming
weight 3 for $S_7$ and 2 for $S_9$. Explicitly, it is $e_3 = 81$ for $S_7$.
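
The relation between the algebraic degree of a power map and the Hamming weight of its exponent can be checked numerically for m = 7. The following Python sketch is not taken from the paper: it assumes the irreducible polynomial $x^7 + x + 1$ to represent $GF(2^7)$ (another choice of basis only changes the map by linear bijections, which leave the algebraic degree unchanged), and it computes the algebraic normal form with the binary Moebius transform.

```python
M = 7
MODULUS = 0b10000011  # x^7 + x + 1, an assumed irreducible polynomial for this sketch


def gf_mul(a, b):
    """Multiply two elements of GF(2^7) in the polynomial basis."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> M:
            a ^= MODULUS
    return result


def gf_pow(x, e):
    """Compute x^e in GF(2^7) by square-and-multiply."""
    result = 1
    while e:
        if e & 1:
            result = gf_mul(result, x)
        x = gf_mul(x, x)
        e >>= 1
    return result


def algebraic_degree(table):
    """Largest algebraic degree over all output bits of a 7-bit lookup table."""
    n = len(table)
    best = 0
    for bit in range(M):
        anf = [(y >> bit) & 1 for y in table]
        step = 1
        while step < n:            # in-place binary Moebius transform
            for i in range(n):
                if i & step:
                    anf[i] ^= anf[i ^ step]
            step <<= 1
        best = max([best] + [bin(i).count("1") for i in range(n) if anf[i]])
    return best


def power_map_degree(e):
    """Algebraic degree of x -> x^e on GF(2^7)."""
    return algebraic_degree([gf_pow(x, e) for x in range(1 << M)])


print(power_map_degree(81))  # exponent of Hamming weight 3
```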



3 Higher order differential attack on a reduced version of
MISTY1

3.1 Basic definitions, notations, and facts

Definition 2 (k-th order differentials). Let $F : GF(2^m) \to GF(2^n)$ be a function.
We call the k-th order differential of F according to a k-dimensional vector
space $V \subset GF(2^m)$ the mapping:

$\Delta_V F : GF(2^m) \to GF(2^n)$

$x \mapsto \Delta_V F(x) = \bigoplus_{v \in V} F(x \oplus v)$

^3 Note that for such a permutation, one necessarily has $\gcd(2^m - 1, e) = 1$.

We denote by $[F]_d$ the sum of the terms in the algebraic normal form of F whose
degrees are at least d. For example,
$[x_0x_2x_4x_5 + x_0x_1x_2 + x_1x_2x_3 + x_0x_1 + 1]_3 = x_0x_2x_4x_5 + x_0x_1x_2 + x_1x_2x_3$.

Fact 1 We have the following properties for all $x, x' \in GF(2^m)$ with $x' \oplus x \in V$:

$\Delta_V F(x') = \Delta_V F(x)$   (1)

$\deg(\Delta_V F) \le \deg(F) - \dim(V)$   (2)

Using (1) when V is $GF(2^m)$, we can write $\Delta_V F$ to refer to $\Delta_V F(0)$, or even
$\Delta_v F$ (with variable v instead of vector space V) when no mistake is possible.

Definition 3 (Bit manipulations). If E belongs to $\{0,1\}^m$, we will write
(E : n) to refer to the n-bit word deduced from E either by extension on the left
with n - m zero bits if n > m or by truncation on the left if n < m. For instance,
0 : 32 will mean a 32-bit zero word. $E^L$, $E^{L9}$, $E^R$, and $E^{R7}$ will respectively refer
to: the left half (more significant bits), the 9 leftmost bits, the right half (less
significant bits), and the 7 rightmost bits of E. $E_1||E_2$ will be the concatenation
of $E_1$ and $E_2$. We will let rev(E) be the mirror-reversed bits of E. We denote $\cap$
the bitwise AND and $\cup$ the bitwise OR. $E \lll k$ denotes the cyclic left shift of
k bits of E; in other words, it is $2^k E$ where E is seen as an element of $\mathbb{Z}/(2^m - 1)\mathbb{Z}$.
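
These bit manipulations can be sketched in Python as follows. The function names are illustrative choices, not notation from the paper, and m-bit words are stored as non-negative integers. The last lines check that a cyclic left shift by k agrees with multiplication by $2^k$ modulo $2^m - 1$ for every 7-bit word other than the all-zero and all-one words.

```python
def rotl(e, k, m):
    """Cyclic left shift of the m-bit word e by k positions."""
    k %= m
    mask = (1 << m) - 1
    return ((e << k) | (e >> (m - k))) & mask


def rev(e, m):
    """Mirror-reverse the m bits of e."""
    return int(format(e, f"0{m}b")[::-1], 2)


def left_bits(e, m, n):
    """The n leftmost (most significant) bits of the m-bit word e."""
    return e >> (m - n)


def right_bits(e, n):
    """The n rightmost (least significant) bits of e."""
    return e & ((1 << n) - 1)


# Cyclic shift versus multiplication by 2^k modulo 2^m - 1, for m = 7.
shift_matches_multiplication = all(
    rotl(e, k, 7) == (e * 2**k) % 127
    for e in range(1, 127)
    for k in range(7)
)
```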
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3.2 Description of the attack 

Definition 4 (M1'). We name M1' the reduced version of MISTY1 that has 5
rounds and no FL function. As in the presentation of MISTY1 above, we let $X_0$
be the right 32-bit word entering M1', $X_1$ the left 32-bit word, and $(X_{i+1}, X_i)$
the intermediate value in M1' after i rounds. We also let x be the
7 right bits of $X_0$ ($x = X_0^{R7}$), and $y(X_1, X_0, K)$ (or y) be the 7 leftmost bits of
$FO_3$'s output word, while $(X_1, X_0)$ is the entering word of M1' and K is the key.
We call “constant” and denote cst everything that does not depend on $x = X_0^{R7}$
(i.e. whose degree in x is zero).

With these notations, we can easily check on M1' the following property:

$\Delta_x y = C$   (3)

to be understood as:

$\bigoplus_{x \in GF(2^7)} y(X_0^{L25}||x, X_1, K) = C$   (4)

C being a constant independent of K and of the input bits $(X_1, X_0^{L25})$. (3) says
that C is the 7th order differential of the function $(X_0, X_1, K) \mapsto y$ with respect
to the subspace described by x. In other words, this means that the coefficients of the highest
degree monomial^4 in x in the expression of the y bits as boolean
functions of x's 7 bits are independent of all other parameters K, $X_1$, and $X_0^{L25}$.
Property (3) will turn out to come from a property of $S_7$ (see next Section);
thus, the constant C will strongly depend on the choice of $S_7$. C is actually very
easy to compute, provided that we have an implementation of M1', with formula
(4). With the $S_7$ chosen in [5], the constant turns out to be $1101101_2$.

This property can be used to perform a chosen plaintext attack of M1' in the
following way. In M1', the plaintext is $(X_1, X_0)$, and the ciphertext is $(X_5, X_6)$.
$X_5$ and $X_6$ are functions of the plaintext $(X_1, X_0^{L25}, x)$. Considering the last
round, we have $X_4(x) = X_6(x) \oplus FO(X_5(x), K_5)$. We can compute the 7th order
differential with respect to x of the 7 leftmost bits in this expression. As $X_4^{L7}$
is the sum of y and of $(x \oplus cst)$, we have $\Delta_x X_4^{L7} = \Delta_x y$. With (3), we obtain
the following attacking equation:

$\bigoplus_{x \in GF(2^7)} (X_6^{L7}(X_1, X_0^{L25}, x) \oplus FO_5^{L7}(X_5(X_1, X_0^{L25}, x); K_5)) = C$   (5)

We do not need to compute the whole $FO_5$ function, but just the 7 leftmost
bits. This means that we only need to compute $FI_{51}^{L7}$ and $FI_{52}^{L7}$. To compute
these, we just go through $S_9$ once and $S_7$ once. When we sum in (5), some
monomials of key bits vanish. With one chosen plaintext, we get one equation
on monomials on subkey bits. The subkeys involved are $K_{511}$ and $K_{521}$, each
one with degree 1 in 9 bits, and $K_{512}$ and $K_{522}$, each one with degree 2 in 7 bits.
One can linearize terms of degree two. Finally, the attack can be performed by
choosing enough sets of $2^7$ plaintexts associated to various $(X_1, X_0^{L25})$ values in
order to get enough such linearized equations (see [9]), or simply by exhaustive
search on those subkeys.

^4 The highest degree monomial is $x_0x_1x_2x_3x_4x_5x_6$.

4 Where does this higher order differential property
come from?

In this section, we derive property (3) from another property on $S_7$. This
deduction starts with analytic details on the M1' scheme in order to have an explicit
expression of y (subsection 4.1). The analysis of the highest degree monomial of
y gives us a sufficient property on $S_7$ for property (3) to hold (subsection 4.2).
This allows us to characterize the exponents of $S_7$-like S-boxes with respect to
property (3) (subsection 4.3). Finally, we comment on the choice of the exponent
$e_3$ in $S_7$ (subsection 4.4).



4.1 Low-degree property on 3 rounds 

We can summarize the phenomenon as follows: the 7 left bits y of $FO_3$'s output
appear to come from the 7 right bits x of $X_0$. As x's bits have gone through few
S-boxes, the degree in the x bits of most intermediate bits involved in the computation
of y is pretty low.

To help the reader, we have tried to summarize the steps on figures 4 and 5.

1st round. $X_1$ goes through $FO_1$ and is added to $X_0$: the intermediate value
after one round is $(X_2 = X_0 \oplus cst, X_1 = cst)$.

2nd round. $X_2$ goes through $FO_2$ and is added to $X_1$. $X_2$ is a 32-bit word,
whose constant left 16-bit word goes through $FI_{21}$, which gives another
constant. $X_2^R$ is added onto this constant. The result $cst \oplus X_2^R$ on one hand is
added to the output $\alpha$ of $FI_{22}$, and on the other hand goes through $FI_{23}$ to give
$\beta$. We have $\alpha = FI_{22}(X_0^R \oplus cst)$ and $\beta = FI_{23}(X_0^R \oplus cst)$ (see figure 4).

Let us consider the degree (in x) going out of $FO_2$. If we call $\lambda$ the right
16-bit word of $FO_2$'s output, we can see that $\lambda$'s degree in x is only 2. Indeed,
$\lambda = \beta \oplus \alpha \oplus X_0^R \oplus cst$. Moreover, FI applied to $(X_0^R \oplus cst)$ will raise an expression of degree
in x equal to 3, and with coefficients of degree 3 independent of constants^5, so
that we have $[\alpha]_3 = [\beta]_3$, and then $[\lambda]_3 = [\alpha]_3 \oplus [\beta]_3 = 0$. Thus, $\deg(\lambda) \le 2$.

^5 This is easy to check on the design of an FI-box: the variables are concentrated in
the 7 rightmost bits of the 16-bit input, so that they just go through at most one S-box.
$S_9$ has degree 2 and $S_7$ has degree 3, which means that $\deg(S_9(P)) \le 2 \cdot \deg(P)$
and $\deg(S_7(P)) \le 3 \cdot \deg(P)$ for any polynomial P. For $P = x \oplus cst$, i.e. $\deg(P) = 1$,
$S_9(P)$ and $S_7(P)$ have respective degree 2 and 3, with highest degree coefficients
independent of constants, because highest degree monomials of the output come
from highest degree monomials of the input.

Now, let us consider $\mu$, the left 16-bit word of $FO_2$'s output. We have $\mu =
\alpha \oplus X_0^R \oplus cst$, and $\deg(\mu) = 3$ for the same reason as above. We will actually
need to be more precise. We will need the expression of $\mu^{R7}$. This is easy to
derive from the FI design (we omit to write any 9-bit to 7-bit truncation and any
7-bit to 9-bit zero-extension):

$\mu^{R7} = (x \oplus cst) \oplus FI_{22}^{R7}(x \oplus cst)$

$= (x \oplus cst) \oplus (S_9(x \oplus cst) \oplus S_7(x \oplus cst) \oplus x \oplus cst)$

$= S_7(x \oplus cst) \oplus S_9(x \oplus cst) \oplus cst$   (6)

Note that, by truncature, this is also the expression of 

To summarize the output $X_3$ of the 2nd round, we have, from left to right,
9 bits of degree 3 ($\mu^{L9}$), 7 bits of degree 3 ($\mu^{R7}$ in (6)), and 16 bits of degree 2 ($\lambda$).

3rd round. Looking at the 7th degree monomial at the output of $FO_3$, we
will establish the following result:

$[y]_7 = [S_7(\mu^{R7} \oplus cst)]_7$   (7)

As for the 2nd round, we need to follow thoroughly the design of $FO_3$ to derive
this equation (see figure 5). As input of $FI_{31}$, we have the 16-bit word $\mu =
\mu^{L9}||\mu^{R7}$, whose degree is 3. The 7 leftmost bits going out of $FI_{31}$ come from
the expression $S_7(\mu^{R7} \oplus cst) \oplus S_9(\mu^{L9} \oplus cst) \oplus \mu^{R7} \oplus cst$. Considering the 7th
degree part of the terms, we obviously have $[\mu^{R7}]_7 = [cst]_7 = 0$. As $\deg_x \mu^{L9} = 3$,
we also have $\deg_x S_9(\mu^{L9} \oplus cst) = 6$, so that $[S_9(\mu^{L9} \oplus cst)]_7 = 0$, and finally we
see that the 7 leftmost bits of the output of $FI_{31}$ have 7th degree monomials
that are $[S_7(\mu^{R7} \oplus cst)]_7$.

Following the intermediate values in $FO_3$, we then see that onto this output
is added the right 16-bit part $\lambda$ of $FO_3$'s input, whose degree is 2. The 7th degree
monomial of the 7 leftmost bits stays unchanged. It is then added to the output
of $FI_{32}$ with input of degree 2, before reaching y. The 7 leftmost bits of the
output of $FI_{32}$ actually have a degree equal to 6: as above, we call $\lambda$ $FI_{32}$'s
input with $\deg_x \lambda = 2$, and we see that the 7 leftmost bits of the output come
from the expression $S_7(\lambda^{R7} \oplus cst) \oplus S_9(\lambda^{L9} \oplus cst) \oplus \lambda^{R7} \oplus cst$, whose highest
degree monomials come from the $S_7$ term, so that this highest degree is 6. Thus, it
has no contribution to the 7th degree of the 7 leftmost bits. This shows that
equation (7) holds.

As a conclusion, we derive from (6) and (7) the expression of $[y]_7$ according
to x and constants:

$[y]_7 = [S_7(S_7(x \oplus cst) \oplus S_9(x \oplus cst) \oplus cst)]_7$

Note that $[y]_7 = (\Delta_x y) \cdot (x_0x_1x_2x_3x_4x_5x_6)$, and then, with properties (1) and
(2), we can make a change of variable that leads to a monomial F of degree 7
(and giving a name to constants):

$[y]_7 = F(x) = [S_7(S_7(x) \oplus S_9(x \oplus c) \oplus c')]_7$   (8)

We should notice that there is no such basic equation that holds when we put
in the FL functions. Indeed, as they mix bits, even though linearly, they make
the highest degree terms spread and interfere with each other, so that things
do not go as well for the attacker as, for example, on figure 4.

4.2 Analysis of the 7th degree monomial

If we call $t(x) = S_7(x) \oplus S_9(x \oplus c) \oplus c'$ the argument of $S_7$ in (8), $S_7$ being a
polynomial function of degree 3, the possible ways of obtaining a degree 7 in
x involve products of 3 monomials of t(x) of degree 1, 2, or 3 in x, but no constant
term. Thus, if we replace t(x) by $t'(x) = [S_7(x)]_{\deg_x \ge 1} \oplus [S_9(x \oplus c)]_{\deg_x \ge 1}$, we
still have $F(x) = [S_7(t'(x))]_7$. We can say more: to obtain a degree 7, we need
to pick in t'(x) 3 terms whose sum of degrees in x is at least 7. But this is
not precise enough to raise easy conclusions.

It is better to consider the degrees in (x, c)^6 and not just x. We classify
the possible combinations of t'(x) monomials that give a degree 7 in x by their
degree in (x, c). $S_7(x)$ has just monomials in x and none in c. $S_9(x \oplus c)$ has
monomials of degree 2, 1, and 0 in (x, c), so that the possibilities are:

- $\deg_x = 7$ and $\deg_{(x,c)} = 9$: these monomials would only come from a product
of 3 monomials of degree 3 in (x, c), in which there are some monomials in c.
But this can never happen, because any degree 3 comes from the term $S_7(x)$
in t'(x), and this term does not have any monomial in c;

- $\deg_x = 7$ and $\deg_{(x,c)} = 8$: these monomials come from a product of two
monomials of degree 3 (a priori in (x, c), but actually in x because they
come from $S_7(x)$) and from one monomial of degree 2 with a degree 1 in c
(it comes from $S_9(x \oplus c)$). So this part of F(x) can be written $[S_7(t''(x))]_7$
with $t''(x) = [S_7(x)]_3 \oplus [S_9(x \oplus c)]_{(1,1)}$, where $[f]_{(1,1)}$ means monomials of
degree 2 that are degree 1 in x and 1 in c;

- $\deg_x = 7$ and $\deg_{(x,c)} = 7$: this raises a monomial in $x_0x_1x_2x_3x_4x_5x_6$ with a
coefficient C independent of c.

The conclusion is that $F(x) = [S_7(t''(x))]_7 \oplus C \cdot x_0x_1x_2x_3x_4x_5x_6$ where
$t''(x) = [S_7(x)]_3 \oplus [S_9(x \oplus c)]_{(1,1)}$ and where the c-dependent part of F(x) is a sum of
products of two monomials of $[S_7(x)]_3$ and one monomial of $[S_9(x \oplus c)]_{(1,1)}$.

The fact that makes property (3) hold with the C above is that $[S_7(t''(x))]_7 = 0$.
This property is not trivial, and comes straight from the following fact:

Fact 2 The product of two output bits of $S_7(x)$ has 6th degree terms that always
cancel out.

Fact 2 is a sufficient property on $S_7$ for property (3) on M1' to hold. Taking
into account the fact that there exists an $e_3$ of Hamming weight 3 and a linear
permutation L such that $S_7(x) = L(x^{e_3})$, the former property can be restated
as follows^7:

$\forall L_1, L_2,\ \forall L,\ \deg_x L_1(L(x^{e_3})) \cdot L_2(L(x^{e_3})) < 6$ where $|e_3| = 3$   (9)

or, by changing $L_i$ into $L_i \circ L$:

$\forall L_1, L_2,\ \deg_x L_1(t) \cdot L_2(t) < 6$ where $t = x^{e_3}$ and $|e_3| = 3$   (10)

where $L_1$ and $L_2$ are two linear maps $GF(2^7) \to GF(2)$, and $L \in GL(GF(2^7))$.
It is equivalent to prove it for $(L_1, L_2)$ describing a basis, i.e.:

$\forall i, j,\ \deg_x t_i \cdot t_j < 6$ where $t = x^{e_3}$ and $|e_3| = 3$   (11)

This property is true on $\deg_x t_i t_j$ for any i, j if and only if it is true on
$\deg_x t^{e_2} = \deg_x x^{e_2 e_3}$ for any $e_2$ with Hamming weight 2. The “only if” part is easy^8. So
property (11) is equivalent to:

$\forall e_2,\ |e_2| = 2,\ \deg_x x^{e_2 e_3} < 6$ where $|e_3| = 3$   (12)

^6 This is the degree of the expression seen as a polynomial in the bits of x and of c.

^7 The property turns out to be true for the chosen $S_7$, but the equivalence that will be
stated in Fact 3 holds for any L and any $e_3$ of Hamming weight 3.

Fact 2 states that the above property holds for the exponent $e_3$ used in
MISTY1. In the subsequent sections we will show that the same property
necessarily holds for any possible exponent $e_3$ that satisfies the other properties
required for provable security of MISTY1.
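
Property (12) can be checked directly on exponents, using the fact from Definition 1 that the degree of a power map is the Hamming weight of its exponent reduced modulo $2^7 - 1 = 127$. The Python sketch below is an illustration, not the authors' procedure; the exponent 7 is a hypothetical weight-3 example chosen here only for comparison with $e_3 = 81$.

```python
M = 7
ORDER = (1 << M) - 1  # 127, the order of the multiplicative group of GF(2^7)


def weight(e):
    """Hamming weight of the integer e."""
    return bin(e).count("1")


def weight2_exponents(m=M):
    """All exponents of Hamming weight 2 below 2^m."""
    return [(1 << a) | (1 << b) for a in range(m) for b in range(a + 1, m)]


def max_product_weight(e3):
    """Largest Hamming weight of e2 * e3 mod 127 over all e2 of weight 2."""
    return max(weight((e2 * e3) % ORDER) for e2 in weight2_exponents())


print(max_product_weight(81))  # exponent given for S_7 in the text
print(max_product_weight(7))   # hypothetical weight-3 exponent for comparison
```

A result below 6 means that (12) holds for that exponent; a result of 6 means that some product of two output bits of the power map reaches degree 6.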

4.3 Characterizations of exponents 

The property that allows the higher order differential attack is expressed in
terms of the S-box in (9) and in terms of the chosen exponent in (12).
Property (12) turns out to be very useful to find a characterization of exponents
with respect to the higher order differential attack.

First, we need to put the stress on the fact that, given an S-box of the form
$x \mapsto L(x^e)$, there is not just one exponent e that can be taken but many of them.

Definition 5 (Set of S-box exponents). On $GF(2^m)$, we call set of S-box
exponents and we write $\mathcal{E}_m$ the set of exponents e such that $x \mapsto x^e$ is one to one.
This is the set of integers between 1 and $2^m - 1$ that are coprime with $2^m - 1$.
Thus, $\mathcal{E}_m$ is the subset of invertible elements of $\mathbb{Z}/(2^m - 1)\mathbb{Z}$. For example,
$\mathcal{E}_7 = [1, 126]$.

Definition 6 (S-box equivalent exponents). Let ~ be the equivalence relation
defined on $\mathcal{E}_m$ by: $e_1 \sim e_2$ if there exist two linear permutations $L_1$ and
$L_2$ such that $\forall x \in GF(2^m)$, $L_1(x^{e_1}) = L_2(x^{e_2})$. We say that $e_1$ and $e_2$ are
S-box equivalent exponents. We write $EQU_m(e)$ the equivalence class of e (i.e.
$EQU_m(e) \in \mathcal{E}_m/\sim$).

The linear permutations of the form $x \mapsto x^e$ are the cyclic shift maps $x \mapsto x^{2^k}$.
Indeed, we have $\deg_x x^e = 1$ because it is linear and $\deg_x x^e = |e|$ as in definition 1.
Thus, e is an exponent of Hamming weight 1 and such an exponent is $2^k$ for some
$k \in \mathbb{Z}/m\mathbb{Z}$.

^8 We have checked (but not proved) that $\max_{i,j \in [0..6]} \deg_x t_i t_j = \max_{|e_2|=2} \deg_x t^{e_2}$
for t such that $t = x^e$ and $|e| \le 3$.

The relation $e_1 \sim e_2$ is equivalent to:

$\exists M \in GL(GF(2^m)),\ \forall x \in GF(2^m),\ x^{e_1/e_2 \pmod{2^m - 1}} = M(x)$

which means, with the previous remark, that $x \mapsto x^{e_1/e_2}$ is a cyclic shift map
$x \mapsto x^{2^k}$ for some $k \in \mathbb{Z}/m\mathbb{Z}$. Thus, $e_1 \sim e_2$ is equivalent to:

$\exists k \in \mathbb{Z}/m\mathbb{Z},\ e_1 = 2^k e_2 \pmod{2^m - 1}$   (13)

Generally, the cardinal of $EQU_m(e)$ depends on e, particularly on the bit
periodicity of e. By definition, the bit periodicity of an m-bit word w is the smallest
positive integer T such that $w \lll T = w$. We write its period T(w). It is
obvious that T(w) divides m. With (13), we have: $\#EQU_m(e) = T(e)$.

If m is prime, exponents $e \in \mathcal{E}_m$ have a period equal to m. In the case of
$S_7$, where m = 7, each class of S-box equivalent exponents has exactly 7 elements.
Permutations on $GF(2^7)$ of the form $L(x^e)$ are defined modulo S-box equivalence
classes, which means for example that a given $S_7$ can be expressed with 7 couples
(L, e).

We also notice that power maps with S-box equivalent exponents have exactly 
the same differential and linear properties, since they only differ by a linear 
permutation. 

Definition 7 (Complementary Cyclic Shift Property). We say that a n- 
bit integer e has the n-hit eomplementary eyelie shift property (n-hit CCS prop- 
erty) if there exists an integer k modulo n sueh that e fl (e <<3C k) = 0. Equiva- 
lently stated, e has the n-hit CCS property if there is an exponent 62 of Hamming 
weight 2 sueh that the produet 662 has Hamming weight 2|e|. 

Definition 8 (CCS eqnivalent exponents). LetTZ be the equivalenee relation 
defined on £m by : e\TZe 2 if 3k G ei = 2*62(mod 2™ — 1) or ei = 

2*rer(e2)(mod 2™ — 1). We say that ei and 62 are eomplementary eyelie shift 
(CCS) equivalent exponents. We write CCSmie) the equivalenee elass of e (i.e. 

CCSUe) e 

This equivalence relation $\mathcal{R}$ naturally comes from the definition of the CCS
property, because if $e$ has this property, so have $e \lll k$ (for all $k$) and $\mathrm{rev}(e \lll k)$.
In other words, if $e$ has the CCS property, all the exponents in $\mathrm{CCS}_m(e)$ have
it. We call a class CCS-positive if its exponents have the property, and CCS-negative if
they do not.

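The link between S-box equivalent exponents and CCS equivalent exponents is easy to
make with property (13) and Definition 8. We have:

$$\forall e \in \mathcal{E}_m,\quad \mathrm{CCS}_m(e) = \mathrm{EQU}_m(e) \cup \mathrm{EQU}_m(\mathrm{rev}(e)) \qquad (14)$$

To be more precise, if $e \sim \mathrm{rev}(e)$, then $\mathrm{CCS}_m(e) = \mathrm{EQU}_m(e)$ and $\#\mathrm{CCS}_m(e) = T(e)$.
If $e \not\sim \mathrm{rev}(e)$, then $\mathrm{CCS}_m(e) = \mathrm{EQU}_m(e) \sqcup \mathrm{EQU}_m(\mathrm{rev}(e))$ (a disjoint union) and $\#\mathrm{CCS}_m(e) = 2\,T(e)$
(since $T(\mathrm{rev}(e)) = T(e)$).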
Given an S-box of the form $L(x^e)$, we can define its S-box class as the S-box
class of $e$ ($\mathrm{EQU}_m(x \mapsto L(x^e)) = \mathrm{EQU}_m(e)$) and its CCS class as the CCS
class of $e$ ($\mathrm{CCS}_m(x \mapsto L(x^e)) = \mathrm{CCS}_m(e)$). $\mathrm{EQU}_m(x \mapsto L(x^e))$ is well-defined
by definition of $\sim$, and $\mathrm{CCS}_m(x \mapsto L(x^e))$ is well-defined because $\mathrm{EQU}_m(e) \subset
\mathrm{CCS}_m(e)$. We say that such an S-box is CCS-positive or CCS-negative according
to its CCS class.

Here is why the CCS property is interesting. 

Fact 3 The following properties are equivalent.

- (12) holds.

- $S_7$ is CCS-negative.

- for any $e_2$ of Hamming weight 2, the product $e_2 e_3$ has a Hamming weight
strictly lower than 6.

If they are true, the 7th-order differential attack on M1' is possible. With the choice
of $S_7$ in MISTY1, they are true.

This fact comes straight from property (12), from Definitions 7 and 8, and from
$\deg_x x^{e_2 e_3} = |e_2 e_3|$. To see that this holds for MISTY1, one can easily check (9)
with $L$ and $e_3$ chosen such that $S_7(x) = L(x^{e_3})$.

This allows us to classify the exponents according to whether they belong to a
positive or to a negative class. By Fact 3, the 7th-order differential property (3) will hold,
and the attack will be possible, as soon as the exponent does not have the CCS property,
that is, as soon as it lies in a CCS-negative class.

4.4 On the choice of the $S_7$ exponent

After a short study of all exponents of Hamming weight 3 in $\mathcal{E}_7$, we saw that
there are 4 CCS classes, 3 of them being positive. The negative class has 14
elements, which means that it is made up of 2 classes of S-box equivalent exponents.
It also necessarily contains $e_3$. We give the two subclasses explicitly:

EQU7(81) = {81, 35, 70, 13, 26, 52, 104} (15) 

EQU7(rev(81)) = {69,11,22,44,88,49,98} (16) 

The 3 other classes are positive; they are generated (for example) by 7 ($7 \cap (7 \lll 3) = 0$),
19 ($19 \cap (19 \lll 2) = 0$), and 21 ($21 \cap (21 \lll 1) = 0$). Each of these CCS
classes is also a class of S-box equivalent exponents, with 7 elements. As they all
are CCS-positive, using them in $S_7$ would prevent the 7th-order differential
attack on M1'.
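The classification above is small enough to reproduce by brute force. The following Python sketch is not taken from the paper; the helper names (`rotl`, `rev`, `equ_class`, `ccs_class`, `has_ccs`) are ours, and it assumes that exponents are 7-bit words, that S-box equivalence is cyclic rotation as in (13), and that CCS equivalence additionally allows bit reversal as in Definition 8.

```python
M = 7                      # word size in bits (m = 7 for S7)
FULL = (1 << M) - 1        # 2^m - 1 = 127

def rotl(e, k, m=M):
    """Cyclic left shift of an m-bit word, i.e. e * 2^k mod (2^m - 1)."""
    k %= m
    return ((e << k) | (e >> (m - k))) & ((1 << m) - 1)

def rev(e, m=M):
    """Bit reversal of an m-bit word."""
    return int(format(e, f"0{m}b")[::-1], 2)

def equ_class(e, m=M):
    """S-box equivalence class: all cyclic shifts of e, see (13)."""
    return frozenset(rotl(e, k, m) for k in range(m))

def ccs_class(e, m=M):
    """CCS equivalence class: EQU(e) union EQU(rev(e)), see (14)."""
    return equ_class(e, m) | equ_class(rev(e, m), m)

def has_ccs(e, m=M):
    """Definition 7: some cyclic shift of e is bitwise disjoint from e."""
    return any(e & rotl(e, k, m) == 0 for k in range(1, m))

# All exponents of Hamming weight 3, grouped into CCS classes.
weight3 = [e for e in range(1, FULL) if bin(e).count("1") == 3]
classes = {ccs_class(e) for e in weight3}
positive = [c for c in classes if has_ccs(min(c))]
negative = [c for c in classes if not has_ccs(min(c))]

for c in sorted(classes, key=min):
    sign = "positive" if has_ccs(min(c)) else "negative"
    print(sign, sorted(c))
```

Under these assumptions the script lists four classes: three positive classes of 7 elements each and one negative class of 14 elements, matching (15) and (16).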

We have computed the average differential probability DP and average
linear probability LP (see [5]) for functions $x \mapsto x^{e_3}$ with $e_3$ in each of these
3 classes. They are always worse than those of the original $S_7$ box, whose exponent lies
in a CCS-negative class.

Fact 4 There is no Hamming weight 3 exponent $e_3$ that is optimal with respect to both
linear and differential properties and higher order differential properties. None of the
S-boxes with optimal linear and differential properties has an optimal behaviour
with respect to higher order differential cryptanalysis.
Remark 1. MISTY1's $S_7$ has the S-box class $\mathrm{EQU}_7(13)$. $13 = 2^{2k} - 2^k + 1$ is
a Kasami exponent with $k = 2$. The other S-box class that is CCS-negative is
$\mathrm{EQU}_7(11)$. $11 = 2^{(m-1)/2} + 3$ is a Welch exponent with $m = 7$ (proved in [3]).

These two classes are proved to be optimal with respect to differential and linear
properties.

The proven optimality of these two classes with respect to differential and
linear properties is linked in our case. Indeed, the optimality of a permutation
$f$ carries over to the inverse permutation $f^{-1}$. On power maps $x \mapsto x^{e_3}$ with $e_3 \in
\mathrm{CCS}_7(11)$, we have a property that makes $e_3^{-1}$ and $\mathrm{rev}(e_3)$ S-box equivalent
exponents. This property is that $e_3$ and $\mathrm{rev}(e_3)$ are CCS equivalent but not
S-box equivalent. This implies that we can find $k$ such that $e_3 \cdot (2^k\,\mathrm{rev}(e_3))$ is
of Hamming weight 6. Thus, $e_3 \cdot (2^k\,\mathrm{rev}(e_3))$ is $2^7 - 2$ up to a cyclic shift. As
$x^{2^7 - 1} = 1$ in $GF(2^7)$, we see that the power maps $x \mapsto x^{e_3^{-1}}$ and $x \mapsto x^{\mathrm{rev}(e_3)}$
are the same up to a linear permutation. Then, if $x \mapsto x^{e_3}$ is optimal with
respect to differential and linear properties, so is $x \mapsto x^{\mathrm{rev}(e_3)}$. This proves that
all exponents in $\mathrm{CCS}_7(11)$ are equivalent with respect to differential, linear, and
(9) properties.
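The S-box equivalence between the inverse exponent and the reversed exponent can be checked numerically. The sketch below is ours, not the authors': it assumes exponents are taken modulo $2^7 - 1 = 127$ and uses Python's built-in modular inverse (`pow(e, -1, n)`, available from Python 3.8). It tests the relation directly rather than following the Hamming-weight argument above.

```python
M = 7
N = (1 << M) - 1  # 127, the order of the multiplicative group of GF(2^7)

def rev(e, m=M):
    """Bit reversal of an m-bit word."""
    return int(format(e, f"0{m}b")[::-1], 2)

def equ_class(e):
    """All exponents 2^k * e mod 127 (S-box equivalent exponents)."""
    return frozenset((e * pow(2, k, N)) % N for k in range(M))

# The CCS-negative class: EQU_7(11) together with EQU_7(rev(11)) = EQU_7(13).
ccs_11 = equ_class(11) | equ_class(rev(11))

# For every exponent e in the class, e^{-1} mod 127 should be a cyclic
# shift of rev(e), i.e. x -> x^(1/e) and x -> x^rev(e) differ only by a
# Frobenius map x -> x^(2^k), which is linear over GF(2).
inverse_matches_rev = all(
    pow(e, -1, N) in equ_class(rev(e)) for e in ccs_11
)
print(inverse_matches_rev)
print((11 * 13) % N)  # a power of two
```

For instance, $11 \cdot 13 = 143 \equiv 2^4 \pmod{127}$, so $13$ equals $11^{-1}$ up to a cyclic shift.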



5 Conclusion 



We showed that the 7th-order differential property on MISTY1 with no FL function
and on 5 rounds comes from the choice of the $S_7$ box. It is actually not
possible to find an $S_7$ box coming from a mapping $x \mapsto x^{e_3}$ with an exponent
$e_3$ of Hamming weight 3 that is at the same time optimal from the points
of view of average differential/linear probabilities and of the 7th-order differential
property. It may be possible to prevent this higher order property without
greatly altering the scheme, while still keeping the provable security properties.
For instance, the S-boxes could be chosen with higher weight exponents (although
that might increase the hardware implementation complexity), or the
number of rounds in the FI scheme could be increased (although there is a 2nd-order
differential attack on KASUMI with no FL function and on 4 rounds; see
[10]).
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3GPP confidentiality and integrity algorithms [1], the number of rounds in the FI 
function was increased from 3 to 4. 
Abstract. Vaudenay[12] proposed a new way of protecting block ciphers
against classes of attacks, which was based on the notion of decorrelation. 
He also suggested two block cipher families COCONUT and PEANUT. 
Wagner[14] suggested a new differential-style attack called boomerang at- 
tack and cryptanalyzed COCONUT’98. Cheon[5] suggested a new block 
cipher DONUT which was made by two pairwise perfect decorrelation 
modules and is secure against boomerang attack. In this paper we suggest 
an attack called difference distribution attack on DONUT. We also sug- 
gest an improved DONUT which is secure against difference distribution 
attack. 



Key words : Decorrelation, DONUT, Differential Cryptanalysis(DC), Linear 
Cryptanalysis(LC), Difference Distribution Attack(DDA). 

1 Introduction 

Vaudenay[12] proposed a new way of protecting block ciphers against differ- 
ential cryptanalysis(DC)[2, 3] and linear cryptanalysis(LC)[9] which was based 
on the notion of decorrelation. This notion is similar to that of universal functions 
which was introduced by Carter and Wegman[4, 15]. Vaudenay also suggested 
two block cipher families COCONUT(Cipher Organized with Cute Operations 
and NUT) and PEANUT(Pretty Encryption algorithm with NUT). COCONUT 
family used a pairwise perfect decorrelation module and PEANUT family used 
a partial decorrelation module. 

The COCONUT'98 cipher is a product cipher $C_3 \circ C_2 \circ C_1$, where $C_1$ and $C_3$
are 4-round Feistel ciphers and $C_2$ is a pairwise perfect decorrelation module.
Wagner[14] suggested a new differential-style attack called boomerang attack 
and cryptanalysed COCONUT’98. Cheon[5] suggested a new block cipher called 

* This work is supported by Korea Information Security Agency(KISA) grant 2000-S- 
078. 
DONUT (Double Operations with NUT), which was made by two pairwise perfect
decorrelation modules. DONUT is secure against the boomerang attack. In this
paper we will suggest an attack called the difference distribution attack (DDA) and
apply this attack to DONUT. We will also suggest an improved DONUT which
is secure against the difference distribution attack.

This paper is organized as follows. In section 2, we recall the basic definitions 
used in the decorrelation theory and present the previous results of decorrelation 
theory. In section 3, we describe the structure of DONUT and in section 4, we 
introduce the difference distribution attack and apply this attack to DONUT. 
In section 5, we suggest an improved DONUT and in section 6, we estimate the 
security of improved DONUT against DC, LC, boomerang attack. Finally in 
section 7, we conclude the paper. 

2 Preliminaries 

In this section, we recall the basic definitions used in the decorrelation theory 
and briefly present the previous result[12, 13]. 

Definition 1. Given a random function $F$ from a given set $M_1$ to a given set
$M_2$ and an integer $d$, we define the "d-wise distribution matrix" $[F]^d$ of $F$ as an
$M_1^d \times M_2^d$-matrix where the $(x, y)$-entry of $[F]^d$ corresponding to the multipoints
$x = (x_1, \ldots, x_d) \in M_1^d$ and $y = (y_1, \ldots, y_d) \in M_2^d$ is defined as the probability
that we have $F(x_i) = y_i$ for $i = 1, \ldots, d$.

Each row of the d-wise distribution matrix corresponds to the distribution
of the d-tuple $(F(x_1), \ldots, F(x_d))$ where $(x_1, \ldots, x_d)$ corresponds to the index of
the row.

Definition 2. Given two random functions $F$ and $G$ from a given set $M_1$ to a
given set $M_2$, an integer $d$ and a distance $D$ over the matrix space $\mathbb{R}^{M_1^d \times M_2^d}$,
we define the "d-wise decorrelation D-distance between F and G" as being the
distance

$$\mathrm{DecF}_D^d(F, G) = D([F]^d, [G]^d).$$

We also define the "d-wise decorrelation D-bias of function F" as being the
distance

$$\mathrm{DecF}_D^d(F) = D([F]^d, [F^*]^d)$$

where $F^*$ is a uniformly distributed random function from $M_1$ to $M_2$. Similarly,
for $M_1 = M_2$, if $C$ is a random permutation over $M_1$ we define the "d-wise
decorrelation D-bias of permutation C" as being the distance

$$\mathrm{DecP}_D^d(C) = D([C]^d, [C^*]^d)$$

where $C^*$ is a uniformly distributed random permutation over $M_1$.

In the above definition, $C^*$ is called the Perfect Cipher. If a cipher $C$ has
zero d-wise decorrelation bias, we call $C$ a perfectly decorrelated cipher. When the
message space $M = \{0, 1\}^m$ has a field structure, we can construct pairwise
perfectly decorrelated ciphers on $M$ by $C(y) = A \cdot y + B$ where the key $K = (A, B)$
is uniformly distributed on $M^* \times M$. This pairwise perfect decorrelation module
is used for the COCONUT family.

COCONUT is a family of ciphers parameterized by $(m, p(x))$, where $m$ is
the block size and $p(x)$ is an irreducible polynomial of degree $m$ in $GF(2)[x]$. A
COCONUT cipher is a product cipher $C_3 \circ C_2 \circ C_1$, where $C_1$ and $C_3$ are any
(possibly weak) ciphers, and $C_2$ is defined as follows:

$$C_2(y) = A \cdot y + B \bmod p(x),$$

where $A$, $B$ and $y$ are polynomials of degree at most $m - 1$ in $GF(2)[x]$. The
polynomials $A$ and $B$ are secret and act as round keys. Since the COCONUT
family has pairwise perfect decorrelation, the ciphers are secure against the basic
differential and basic linear cryptanalysis [12].
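To make the notion of pairwise perfect decorrelation concrete, the following Python sketch enumerates all keys of the module $y \mapsto A \cdot y + B$ over a toy field. It is an illustration under our own assumptions, not code from the paper: we use $GF(2^8)$ with the polynomial 0x11d instead of a 64-bit or 128-bit field so that exhaustive enumeration is feasible, we take $A \neq 0$, and the names `gf_mul`, `c2` and `pair_distribution` are ours.

```python
from collections import Counter

M_BITS = 8
POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1, toy field chosen for this sketch

def gf_mul(a, b, poly=POLY, m=M_BITS):
    """Multiply two elements of GF(2^m) (shift-and-add with reduction)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> m:          # degree reached m: reduce modulo poly
            a ^= poly
    return r

def c2(y, A, B):
    """Decorrelation module C2(y) = A*y + B over GF(2^m)."""
    return gf_mul(A, y) ^ B

def pair_distribution(y1, y2):
    """Count how often each output pair occurs over all keys (A != 0, B)."""
    counts = Counter()
    for A in range(1, 1 << M_BITS):
        for B in range(1 << M_BITS):
            counts[(c2(y1, A, B), c2(y2, A, B))] += 1
    return counts

dist = pair_distribution(0x01, 0x53)
# Every ordered pair of distinct outputs appears for exactly one key, which
# is the distribution a uniformly random permutation induces on two points.
print(len(dist), set(dist.values()))
```

The output difference $C_2(y_1) \oplus C_2(y_2) = A \cdot (y_1 \oplus y_2)$ does not depend on $B$, which is why a fixed nonzero input difference is mapped to a uniformly distributed nonzero output difference.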

COCONUT'98 is a member of the COCONUT family parameterized by $(64,\ x^{64} +
x^{11} + x^2 + x + 1)$ and uses 4-round Feistel structures for $C_1$ and $C_3$, respectively.
Wagner[14] cryptanalysed COCONUT'98 using the boomerang attack, which exploits
the fact that high probability differentials exist for both $C_1$ and $C_3$.

3 Structure of DONUT 

Frame Structure DONUT transforms a 128-bit plaintext block into a 128-bit
ciphertext block. DONUT uses a variable key length and consists of 4 rounds.
The first round and the fourth round consist of pairwise perfect decorrelation
modules $A_1 \cdot y + B_1$ and $A_2 \cdot y + B_2$, where $A_1$, $B_1$, $A_2$, $B_2$ are 128-bit subkeys.
The inner 2-round transformation consists of two Feistel permutations and each
round uses six 32-bit subkeys (see Fig. 1).



Round Function F The round function F of DONUT consists of three G
functions. A 64-bit input of F is split into two 32-bit words, which are then processed
by a 3-round transformation built from the inner function G. Fig. 2 shows the structure
of the F function.



Inner Function G The function G is a key-dependent permutation on 32-bit
words with two 32-bit subkeys. The function G, which we call the SDS function,
consists of 5 layers as follows:

1. The first key addition layer. 

2. The first substitution layer. 

3. The diffusion layer. 

4. The second key addition layer. 

5. The second substitution layer. 

Fig. 3 shows the structure of G function. 




Fig. 1. Frame Structure

Fig. 2. Structure of F

Fig. 3. Structure of G function



Substitution Layer We use the same S-box, which is an 8-bit input/output
permutation, throughout the substitution layer. The S-box is constructed by a function of
the form $a \cdot x^{-1} \oplus b$, where $a$ = 0xa5, $b$ = 0x37 $\in GF(2^8)$. The Galois field $GF(2^8)$
is defined by the irreducible polynomial $x^8 + x^4 + x^3 + x^2 + 1$ (hex: 0x11d). In
$GF(2^8)$, the function $x^{-1}$ has good resistance against differential and linear attacks.
The purpose of the affine transform is to remove the two fixed points of the
function $x^{-1}$, namely zero to zero and one to one.
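The S-box construction described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' reference code: we assume the usual convention $0^{-1} = 0$, and the helper names `gf_mul` and `gf_inv` are ours.

```python
POLY = 0x11D                     # x^8 + x^4 + x^3 + x^2 + 1
A_CONST, B_CONST = 0xA5, 0x37    # constants a and b from the text

def gf_mul(a, b):
    """Multiply two bytes as elements of GF(2^8) modulo POLY."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= POLY
    return r

def gf_inv(x):
    """Inverse in GF(2^8) as x^254; maps 0 to 0 (assumed convention)."""
    r = 1
    for _ in range(254):
        r = gf_mul(r, x)
    return r

# S(x) = a * x^{-1} XOR b
SBOX = [gf_mul(A_CONST, gf_inv(x)) ^ B_CONST for x in range(256)]

print(hex(SBOX[0]), hex(SBOX[1]))  # images of the fixed points of x^{-1}
```

With these constants, 0 is mapped to $b$ = 0x37 and 1 is mapped to $a \oplus b$ = 0x92, so neither of the two fixed points of $x^{-1}$ survives.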

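Diffusion Layer The diffusion layer is performed by the 4 x 4 circulant matrix $D$:

$$D = \begin{pmatrix} 01 & 06 & 07 & 02 \\ 06 & 07 & 02 & 01 \\ 07 & 02 & 01 & 06 \\ 02 & 01 & 06 & 07 \end{pmatrix}$$

Let $X = (x_3, x_2, x_1, x_0) = \sum_{i=0}^{3} x_i 2^{8i}$ be the input of the diffusion layer and
$Y = (y_3, y_2, y_1, y_0) = \sum_{i=0}^{3} y_i 2^{8i}$ be the output of the diffusion layer. Then we
have the following:

$$\begin{pmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \end{pmatrix}
= \begin{pmatrix} 01 & 06 & 07 & 02 \\ 06 & 07 & 02 & 01 \\ 07 & 02 & 01 & 06 \\ 02 & 01 & 06 & 07 \end{pmatrix}
\begin{pmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \end{pmatrix}
= \begin{pmatrix}
01 \cdot x_0 \oplus 06 \cdot x_1 \oplus 07 \cdot x_2 \oplus 02 \cdot x_3 \\
06 \cdot x_0 \oplus 07 \cdot x_1 \oplus 02 \cdot x_2 \oplus 01 \cdot x_3 \\
07 \cdot x_0 \oplus 02 \cdot x_1 \oplus 01 \cdot x_2 \oplus 06 \cdot x_3 \\
02 \cdot x_0 \oplus 01 \cdot x_1 \oplus 06 \cdot x_2 \oplus 07 \cdot x_3
\end{pmatrix}$$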
Key Scheduling DONUT has a variable key length. Since the two decorrelation
modules need four 128-bit subkeys and every G function needs two 32-bit (4-byte)
subkeys, we need to generate twenty-eight 32-bit subkeys. Our design
strategy for the key schedule is to prevent an attacker from deriving some round keys
from other round keys.

In the following, let G(X, Y, Z) be a G function with input X, first addition
key Y, and second addition key Z. The notation << n means an n-bit left
shift, and <<< n (>>> n) means an n-bit left (right) rotation. Let b (≥ 16) be the
key length in bytes and uk[0], uk[1], ..., uk[b - 1] be a user-supplied key.

Then the following is the key schedule of DONUT:

- Input : uk[0], uk[1], ..., uk[b - 1].

- Output : k[0], k[1], ..., k[27].

1. for (i = 0; i < 112; i++) L[i] = uk[i % b];

2. for (i = 5; i < 109; i++) L[i] = S[L[i]] ⊕ S[L[i - 5]] ⊕ S[L[i + 3]];

3. X[0] = 0x9e3779b9;

4. Y[0] = 0x67e15163;

5. for (i = 0; i < 28; i++) do the following:

(a) T[i] = L[4i] | (L[4i + 1] << 8) | (L[4i + 2] << 16) | (L[4i + 3] << 24);

(b) X[i + 1] = G(X[i], (T[i] >>> 7), (T[i] <<< 5));

(c) Y[i + 1] = G(Y[i], (T[i] <<< 13), (T[i] >>> 9));

(d) K[i] = X[i + 1] ⊕ Y[i + 1] ⊕ T[i];

4 Difference Distribution Attack 

In this section we suggest an attack called difference distribution attack(DDA) . 
This attack is a chosen plaintext attack and uses the distribution of output dif- 
ferences when the input difference is fixed. In Fig. 4, if the input difference of 
DONUT is fixed, then the output differences are non-uniform with probability 
2“^®. When this occurs, we can say that the input difference of second decorre- 
lation module is of the form (a, 0). 

In the G function, since XOR distribution table of S-box is 4-uniform, all 
components of XOR distribution table is bounded by 4. So the probability of all 
a b is bounded by 2“® where a and b are 8-bit difference and the probability 
of c -S d is bounded by (2“®)^ = 2“^"^ where c and d are 16-bit difference. So 
the probability of a /3 is bounded by (2“^^)^ = 2“^®. This means that if we 
consider 2^® input differences, then we can find the input difference such that 
a p. Consider 2®^ input pairs with above input difference. Since minimum 
non-zero probability of a /3 is 2“®® and we consider 2®^ input pairs, at least 
two pairs have the same output difference. 

The attack starts as follows. First, choose an input difference A and choose 
2®^ plaintext pairs with difference A. Obtain the corresponding ciphertext pairs 
and determine whether the same output differences exist. This occurs with prob- 
ability 2“^®. So if we consider 2®® input differences, then we can obtain an input 
difference A such that the output differences have the same value, say A' . When 
Fig. 4. Difference of Frame Structure



this occurs, we can say that the input difference of the last decorrelation module 
is of the form (a, 0). For all (a, 0), we compute the corresponding key A 2 such 

that (a, 0) and store them. Second, as above, choose another input 

difference such that the output differences have the same value, say Z\". For all 
(a, 0), compute such that (a,0) . For the same (a,0), the right 

key has the same A 2 value. This attack needs 2®^ storage and 2 x 2^® x 2®^ = 2®^ 
chosen plaintext pairs. 



5 Structure of Improved DONUT 

In this section we describe the structure of improved DONUT. There are two
differences between improved DONUT and the original DONUT.



Frame Structure Improved DONUT consists of 5 rounds. The first round and
the fifth round consist of the pairwise perfect decorrelation modules $A_1 \cdot y + B_1$ and
$A_2 \cdot y + B_2$. The inner three rounds consist of three Feistel permutations.
Fig. 5 shows the frame structure of improved DONUT. The inner functions F
and G are the same as in DONUT.
Key Scheduling Since improved DONUT needs four 128-bit subkeys and every
G function needs two 32-bit (4-byte) subkeys, we need to generate thirty-four
32-bit subkeys. The following is the key schedule of improved DONUT:

- Input : uk[0], uk[1], ..., uk[b - 1].

- Output : k[0], k[1], ..., k[33].

1. for (i = 0; i < 136; i++) L[i] = uk[i % b];

2. for (i = 5; i < 133; i++) L[i] = S[L[i]] ⊕ S[L[i - 5]] ⊕ S[L[i + 3]];

3. X[0] = 0x9e3779b9;

4. Y[0] = 0x67e15163;

5. for (i = 0; i < 34; i++) do the following:

(a) T[i] = L[4i] | (L[4i + 1] << 8) | (L[4i + 2] << 16) | (L[4i + 3] << 24);

(b) X[i + 1] = G(X[i], (T[i] >>> 7), (T[i] <<< 5));

(c) Y[i + 1] = G(Y[i], (T[i] <<< 13), (T[i] >>> 9));

(d) K[i] = X[i + 1] ⊕ Y[i + 1] ⊕ T[i];



6 Security of Improved DONUT 

6.1 Resistance against DC and LC 

In this section, we estimate the security of improved DONUT against DC 
and LC. 

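Definition 3. For any given $\Delta x, \Delta y, \Gamma x, \Gamma y \in \mathbb{Z}_2^m$, the differential and linear
probability of each S-box are defined by

$$DP^S(\Delta x \to \Delta y) = \frac{\#\{x \in \mathbb{Z}_2^m \mid S(x) \oplus S(x \oplus \Delta x) = \Delta y\}}{2^m}$$

and

$$LP^S(\Gamma x \to \Gamma y) = \left( \frac{\#\{x \in \mathbb{Z}_2^m \mid \Gamma x \cdot x = \Gamma y \cdot S(x)\}}{2^{m-1}} - 1 \right)^2,$$

where $\Gamma x \cdot x$ denotes the parity of the bitwise AND of $\Gamma x$ and $x$.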



Definition 4. The maximal differential and linear probability of the S-box are
defined by

$$DP^S_{\max} = \max_{\Delta x \neq 0,\ \Delta y} DP^S(\Delta x \to \Delta y)$$

and

$$LP^S_{\max} = \max_{\Gamma x,\ \Gamma y \neq 0} LP^S(\Gamma x \to \Gamma y),$$

respectively.



Since the S-box in the G function of improved DONUT is 8 x 8 and differentially
and linearly 4-uniform, $DP^S_{\max} = 2^{-6}$ and $LP^S_{\max} = 2^{-6}$. Not only the S-box
but also the diffusion is an important factor to give the provable security of the
SDS function.
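The two values can be checked numerically from Definitions 3 and 4. The following Python sketch is ours: it rebuilds the S-box $a \cdot x^{-1} \oplus b$ from Section 3 (assuming $0^{-1} = 0$), computes the largest entry of the difference distribution table, and obtains the largest Walsh coefficient with a fast Walsh-Hadamard transform. All function names are hypothetical helpers.

```python
from fractions import Fraction

POLY = 0x11D

def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8+x^4+x^3+x^2+1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= POLY
    return r

def gf_inv(x):
    """x^254, which is the inverse for x != 0 and 0 for x = 0."""
    r = 1
    for _ in range(254):
        r = gf_mul(r, x)
    return r

SBOX = [gf_mul(0xA5, gf_inv(x)) ^ 0x37 for x in range(256)]

def ddt_max(sbox):
    """Largest count #{x : S(x) ^ S(x ^ dx) = dy} over dx != 0."""
    n, best = len(sbox), 0
    for dx in range(1, n):
        row = [0] * n
        for x in range(n):
            row[sbox[x] ^ sbox[x ^ dx]] += 1
        best = max(best, max(row))
    return best

def walsh_max(sbox):
    """Largest |sum_x (-1)^(gx.x + gy.S(x))| over all gx and gy != 0."""
    n, best = len(sbox), 0
    for gy in range(1, n):
        w = [1 - 2 * (bin(gy & sbox[x]).count("1") & 1) for x in range(n)]
        h = 1
        while h < n:                      # in-place Walsh-Hadamard butterfly
            for i in range(0, n, 2 * h):
                for j in range(i, i + h):
                    u, v = w[j], w[j + h]
                    w[j], w[j + h] = u + v, u - v
            h *= 2
        best = max(best, max(abs(c) for c in w))
    return best

dp_max = Fraction(ddt_max(SBOX), 256)
# #{x : gx.x = gy.S(x)} / 2^(m-1) - 1 equals (Walsh coefficient) / 2^m.
lp_max = Fraction(walsh_max(SBOX), 256) ** 2
print(dp_max, lp_max)
```

Both quantities come out as 1/64, that is $2^{-6}$, in agreement with the text.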
Now the minimum numbers of differentially and linearly active S-boxes of the
SDS function are defined by

$$n_d(D) = \min_{\Delta x \neq 0} \{ Hw(\Delta x) + Hw(\Delta y) \}$$

and

$$n_l(D) = \min_{\Gamma y \neq 0} \{ Hw(\Gamma x) + Hw(\Gamma y) \},$$

respectively [11].

Theorem 1. [6] Let $M$ be the $n \times n$ matrix representing a diffusion layer $D$.
Then $n_d(D) = n + 1$ if and only if the rank of each $k \times k$ submatrix of $M$ is $k$
for all $1 \le k \le n$.

Since we construct the 4 x 4 matrix $D$ so that it satisfies Theorem 1, the diffusion
layer used in improved DONUT is maximal, with $n_d(D) = 5$ and $n_l(D) = 5$.
Therefore, we can give the provable security of the function G with the following
two theorems.
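The condition of Theorem 1 can be verified mechanically for the matrix $D$ of Section 3. The sketch below is an illustration of ours, not part of the paper: it computes determinants over $GF(2^8)$ (polynomial 0x11d) by cofactor expansion and checks every square submatrix. The helper names are hypothetical.

```python
from itertools import combinations

POLY = 0x11D

def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8+x^4+x^3+x^2+1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= POLY
    return r

D = [
    [0x01, 0x06, 0x07, 0x02],
    [0x06, 0x07, 0x02, 0x01],
    [0x07, 0x02, 0x01, 0x06],
    [0x02, 0x01, 0x06, 0x07],
]

def det(mat):
    """Determinant over GF(2^8) by cofactor expansion (signs vanish in char 2)."""
    if len(mat) == 1:
        return mat[0][0]
    total = 0
    for j, a in enumerate(mat[0]):
        minor = [row[:j] + row[j + 1:] for row in mat[1:]]
        total ^= gf_mul(a, det(minor))
    return total

def all_submatrices_nonsingular(mat):
    """True if every k x k submatrix has full rank k (Theorem 1 condition)."""
    n = len(mat)
    for k in range(1, n + 1):
        for rows in combinations(range(n), k):
            for cols in combinations(range(n), k):
                sub = [[mat[r][c] for c in cols] for r in rows]
                if det(sub) == 0:
                    return False
    return True

def diffuse(x):
    """Apply D to the byte vector (x0, x1, x2, x3)."""
    out = []
    for row in D:
        acc = 0
        for coeff, byte in zip(row, x):
            acc ^= gf_mul(coeff, byte)
        out.append(acc)
    return out

print(all_submatrices_nonsingular(D))
print(diffuse([1, 0, 0, 0]))  # one active input byte activates all four outputs
```

A single nonzero input byte therefore yields four nonzero output bytes, giving $1 + 4 = 5$ active bytes, which is consistent with $n_d(D) = 5$.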



Theorem 2. [6] Assume that the round keys, which are XORed to the input data 
at each round, are independent and uniformly distributed. Ifrid{D) = n-|-l, then 
each differential probability of SDS function is bounded by 

Theorem 3. [6] Assume that the round keys, which are XORed to the input 
data at each round, are independent and uniformly distributed. Ifr>i{D) = n+1, 
then each linear probability of SDS function is bounded by 



If we assume that the round keys of improved DONUT are independent 
and uniformly distributed, then we can obtain that the maximal differential 
and linear probability of G function are bounded by The F function con- 
sists of 3- round Skipjack-like structure and the maximal differential and linear 
probability of F function are bounded by = 2“®°. Aoki and Ohta[l] 

showed that the differential probability of 3-round Feistel structure is bounded 
by DP^^„, when the maximum differential probability of inner function is bound 
by DPmax- So we can conclude that the inner 3- round Feistel structure of im- 
proved DONUT has the maximal differential and linear probability bounded by 

((2-30)2)2 ^ 2-120. 

Let M = Mg be a message space where Mg = {0, 1}®^ has a group struc- 
ture. Since improved DONUT uses three round Feistel permutations as round 
functions and decorrelation module Ai-y + Bi is very key dependent [7], the max- 
imum probability of improved DONUT is 2-i20. This case occurs only when the 
input difference of second round F is zero. For a given nonzero input difference of 
decorrelation module, the probability that the input difference of second round 
F is zero is 2 “®^. As a point of view of attacker, he must find the characteristic 
with probability higher than 2 “ 12® in order to attack improved DONUT. But 
this occurs with probability 2“®^ and though he can find the characteristic with 
probability 2-12°, must find the key A\ and A 2 with computational complex- 
ity 2 • 2-12® field multiplication, because he cannot know the input difference of 
inner round Feistel permutations. Therefore we can expect to obtain a higher 
resistance against differential and linear cryptanalysis with only five rounds. 
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6.2 Resistance against Boomerang Attack 

In this section we introduce the boomerang attack and show that improved 
DONUT is secure against boomerang attack. 

The boomerang attack was introduced by D. Wagner[14]. It is a differential attack
that attempts to generate a quartet structure at an intermediate value halfway
through the cipher. If the best characteristic for half of the rounds of the cipher
has probability $q$, then the boomerang attack can be used in a successful attack
needing $O(q^{-4})$ chosen texts. The attacker considers four plaintexts $P, P', Q, Q'$,
along with their respective ciphertexts $C, C', D, D'$. Let $E(\cdot)$ represent the
encryption operation, and decompose the cipher into $E = E_1 \circ E_0$, where $E_0$
represents the first half of the cipher and $E_1$ represents the last half. We will use
two differential characteristics, $\Delta \to \Delta^*$ for $E_0$, as well as $\nabla \to \nabla^*$ for $E_1^{-1}$.

The attacker wants to cover the pair $P, P'$ with the characteristic for $E_0$, and
to cover the pairs $P, Q$ and $P', Q'$ with the characteristic for $E_1^{-1}$. Then the pair
$Q, Q'$ is perfectly set up to use the characteristic $\Delta^* \to \Delta$ for $E_0^{-1}$ as follows:

$$\begin{aligned}
E_0(Q) \oplus E_0(Q') &= E_0(P) \oplus E_0(P') \oplus E_0(P) \oplus E_0(Q) \oplus E_0(P') \oplus E_0(Q') \\
&= E_0(P) \oplus E_0(P') \oplus E_1^{-1}(C) \oplus E_1^{-1}(D) \oplus E_1^{-1}(C') \oplus E_1^{-1}(D') \\
&= \Delta^* \oplus \nabla^* \oplus \nabla^* \\
&= \Delta^*.
\end{aligned}$$

We define a right quartet as one where all four characteristics hold simultaneously.
The only remaining issue is how to choose the texts so they have the right
differences. We can get this as follows. First, we generate $P' = P \oplus \Delta$, and
get the encryptions $C, C'$ of $P, P'$ with two chosen-plaintext queries. Then we
generate $D, D'$ as $D = C \oplus \nabla$ and $D' = C' \oplus \nabla$. Finally we decrypt $D, D'$ to
obtain the plaintexts $Q, Q'$ with two adaptive chosen-ciphertext queries.
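The query pattern of the boomerang attack can be sketched as follows. This Python fragment is purely illustrative and is not the attack on any real cipher: the 16-bit toy cipher, its keys and the function names are assumptions of ours. The toy cipher is affine over GF(2), so every characteristic holds with probability 1 and each quartet is a right quartet; for a real cipher the final equality holds only with the (small) probability discussed above.

```python
MASK = 0xFFFF  # toy 16-bit block

def rotl(x, r):
    """Rotate a 16-bit word left by r bits."""
    return ((x << r) | (x >> (16 - r))) & MASK

def rotr(x, r):
    """Rotate a 16-bit word right by r bits."""
    return ((x >> r) | (x << (16 - r))) & MASK

def make_toy_cipher(k0, k1):
    """E = E1 o E0 with E0(x) = rotl(x, 3) ^ k0 and E1(x) = rotl(x, 5) ^ k1."""
    def encrypt(p):
        return rotl(rotl(p, 3) ^ k0, 5) ^ k1
    def decrypt(c):
        return rotr(rotr(c ^ k1, 5) ^ k0, 3)
    return encrypt, decrypt

def boomerang_quartet(encrypt, decrypt, p, delta, nabla):
    """Build (P, P', Q, Q') as in the text and return Q xor Q'."""
    p2 = p ^ delta                      # P' = P xor Delta
    c, c2 = encrypt(p), encrypt(p2)     # two chosen-plaintext queries
    d, d2 = c ^ nabla, c2 ^ nabla       # D = C xor Nabla, D' = C' xor Nabla
    q, q2 = decrypt(d), decrypt(d2)     # two adaptive chosen-ciphertext queries
    return q ^ q2

encrypt, decrypt = make_toy_cipher(0x1234, 0xBEEF)
delta, nabla = 0x0040, 0x8001
returns = [boomerang_quartet(encrypt, decrypt, p, delta, nabla)
           for p in (0x0000, 0x00FF, 0xA5A5, 0xFFFF)]
print([hex(r) for r in returns])  # a right quartet comes back with difference delta
```

An attacker would count how often the returned difference equals $\Delta$; a count well above what a random permutation would give indicates a usable pair of characteristics.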

Let Mq be the first pairwise perfect decorrelation module of improved DONUT 
and let M\ be the second one. Let Eq be the first two rounds Feistel permutations 
of improved DONUT and let Pi be the last one round Feistel permutation of 
improved DONUT. Take Eq = Eqo Mq and Ei = M\ o Ei. First, Eq has charac- 
teristic with probability bounded by So Eq has no good characteristic for 
boomerang attack. Second, has characteristics with probability I as follows: 

(0,c)^ (c,0) 

where c is a nonzero 64-bit value and 0 is a 64-bit zero value. But we can get 
a characteristic (a, 6) — > (0, c) through with probability 2“®^ because we 
cannot know the subkey of M\ . So we cannot apply the boomerang attack to im- 
proved DONUT and improved DONUT is secure against boomerang attack. Mq 
plays an important role when the number of rounds of Eq and Pi are exchanged. 

7 Conclusion 

In this paper we suggested the difference distribution attack (DDA) and applied
this attack to DONUT. We also suggested an improved DONUT, which
consists of five rounds and uses two pairwise perfect decorrelation modules
and a 3-round Feistel permutation. We showed that improved DONUT is secure
against conventional DC, LC, and the boomerang attack.
Abstract. The absolute indicator for GAC forecasts the overall
avalanche characteristics of a cryptographic Boolean function. From a
security point of view, it is desirable that the absolute indicator of a
function takes as small a value as possible. The first contribution of this
paper is to prove a tight lower bound on the absolute indicator of an mth-order
correlation immune function with n variables, and to show that a
function achieves the lower bound if and only if it is affine. The absolute
indicator for GAC achieves the upper bound when the underlying function
has a non-zero linear structure. Our second contribution is about
a relationship between correlation immunity and non-zero linear structures.
The third contribution of this paper is to address an open problem
related to the upper bound on the nonlinearity of a correlation immune
function. More specifically, we prove that given any odd mth-order correlation
immune function $f$ with $n$ variables, the nonlinearity of $f$, denoted
by $N_f$, must satisfy $N_f \le 2^{n-1} - 2^{m+1}$ for $\frac{1}{2}n - 1 < m < 0.6n - 0.4$, or
$f$ has a non-zero linear structure. This extends a known result that is
stated for $0.6n - 0.4 \le m \le n - 2$.

Keywords: Correlation Immunity,
Absolute Indicator, Nonlinearity, Linear Structures, Stream Ciphers



1 Introduction 

Correlation immunity has long been recognized as one of the critical indicators 
of nonlinear combining functions of shift registers in stream generators (see [10]). 
A high correlation immunity is generally a very desirable property, in view of 
various successful correlation attacks against a number of stream ciphers (see 
for instance [5]). 

Another class of cryptanalytic attacks against stream ciphers, called best 
approximation attacks, were advocated in [3] . Success of these attacks in breaking 
a stream cipher is made possible by exploiting the low nonlinearity of functions 
employed by the cipher, and it highlights the significance of nonlinearity in the 
analysis and design of encryption algorithms. 

However it should be pointed out that correlation immunity is not harmo- 
nious with some other cryptographic requirements. In particular, high correlation 
immunity may introduce weaknesses in terms of a low algebraic degree, a small 
avalanche degree and a low nonlinearity and so on. This can be seen, for instance, 
from recent work in [14,15]. 

D. Won (Ed.): ICISC 2000, LNCS 2015, pp. 49-63, 2001. 
GAC is a nonlinearity indicator introduced in [11] to study the global or
overall avalanche characteristics of a cryptographic function. Two different indicators
were proposed to measure numerically the GAC of a function, namely,
the sum-of-squares indicator and the absolute indicator. A small value for the
absolute indicator of a function is generally more desirable.

In the first part of this paper we show that functions with a high order of
correlation immunity necessarily have weaknesses in their avalanche characteristics.
More specifically, we prove that if / is a balanced mth-order correlation immune 
function with n variables, then the absolute indicator for GAC of /, denoted 
by Af, satisfies Ay > 2™ For an unbalanced function /, we show 

that Af > We further investigate the tightness of the 

lower bounds and identify a necessary and sufficient condition on when the two 
lower bounds are achieved. 

When $\Delta_f = 2^n$, $f$ must have a non-zero linear structure, which is considered
cryptographically undesirable. In the second part of this paper, we employ
correlation immunity to characterize Boolean functions having non-zero linear
structures.

Recently, Zheng and Zhang [14] have proved that if $f$ is an mth-order
correlation immune function with $n$ variables, then its nonlinearity satisfies
$N_f \le 2^{n-1} - 2^{m+1}$ when $0.6n - 0.4 \le m \le n - 2$, regardless of the balance
of the function. Note that the inequality $N_f \le 2^{n-1} - 2^{m+1}$ does not hold for
$m = n - 1$. Fortunately, this is a trivial case, as an $(n-1)$th-order correlation
immune function $f$ with $n$ variables must be affine. In the same paper,
Zheng and Zhang have also shown that the equality holds if and only if $f$ is
a plateaued function. The authors leave the case of
$\frac{1}{2}n - 1 < m < 0.6n - 0.4$ as an open problem. This open problem is addressed in the
third part of this paper. In particular, we prove that the inequality $N_f \le 2^{n-1} - 2^{m+1}$ does
hold for odd $m$ with $\frac{1}{2}n - 1 < m < 0.6n - 0.4$, unless $f$ has a non-zero linear
structure. This brings us a step closer to finally solving the open problem.



2 Boolean Functions 

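We consider functions from $V_n$ to $GF(2)$ (or simply functions on $V_n$), where $V_n$
is the vector space of $n$ tuples of elements from $GF(2)$. The truth table of a
function $f$ on $V_n$ is a (0, 1)-sequence defined by $(f(\alpha_0), f(\alpha_1), \ldots, f(\alpha_{2^n-1}))$,
and the sequence of $f$ is a (1, -1)-sequence defined by
$((-1)^{f(\alpha_0)}, (-1)^{f(\alpha_1)}, \ldots, (-1)^{f(\alpha_{2^n-1})})$,
where $\alpha_0 = (0, \ldots, 0, 0)$, $\alpha_1 = (0, \ldots, 0, 1)$, $\ldots$, $\alpha_{2^{n-1}-1} = (0, 1, \ldots, 1)$, $\ldots$,
$\alpha_{2^n-1} = (1, \ldots, 1, 1)$. The matrix of $f$ is a (1, -1)-matrix of order $2^n$ defined by
$M = ((-1)^{f(\alpha_i \oplus \alpha_j)})$, where $\oplus$ denotes the addition in $GF(2)$.

Given two sequences $\tilde a = (a_1, \ldots, a_m)$ and $\tilde b = (b_1, \ldots, b_m)$, their component-wise
product is defined by $\tilde a * \tilde b = (a_1 b_1, \ldots, a_m b_m)$. In particular, if $m = 2^n$ and
$\tilde a$, $\tilde b$ are the sequences of functions $f$ and $g$ on $V_n$ respectively, then $\tilde a * \tilde b$ is the
sequence of $f \oplus g$, where $\oplus$ denotes the addition in $GF(2)$.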

Let $\tilde a = (a_1, \ldots, a_m)$ and $\tilde b = (b_1, \ldots, b_m)$ be two sequences or vectors;
the scalar product of $\tilde a$ and $\tilde b$, denoted by $\langle \tilde a, \tilde b \rangle$, is defined as the sum of
the component-wise multiplications. In particular, when $\tilde a$ and $\tilde b$ are from $V_m$,
$\langle \tilde a, \tilde b \rangle = a_1 b_1 \oplus \cdots \oplus a_m b_m$, where the addition and multiplication are over $GF(2)$,
and when $\tilde a$ and $\tilde b$ are (1, -1)-sequences, $\langle \tilde a, \tilde b \rangle = \sum_{i=1}^{m} a_i b_i$, where the addition
and multiplication are over the reals.

An affine function $f$ on $V_n$ is a function that takes the form of $f(x_1, \ldots, x_n) =
a_1 x_1 \oplus \cdots \oplus a_n x_n \oplus c$, where $a_j, c \in GF(2)$, $j = 1, 2, \ldots, n$. Furthermore $f$ is
called a linear function if $c = 0$.

A (1, -1)-matrix $N$ of order $n$ is called a Hadamard matrix if $N N^T = n I_n$,
where $N^T$ is the transpose of $N$ and $I_n$ is the identity matrix of order $n$. A
Sylvester-Hadamard matrix of order $2^n$, denoted by $H_n$, is generated by the
following recursive relation

$$H_0 = 1, \qquad H_n = \begin{pmatrix} H_{n-1} & H_{n-1} \\ H_{n-1} & -H_{n-1} \end{pmatrix}, \quad n = 1, 2, \ldots$$

Obviously $H_n$ is symmetric. Let $\ell_i$, $0 \le i \le 2^n - 1$, be the $i$th row of $H_n$. It is known
that $\ell_i$ is the sequence of a linear function $\varphi_i(x)$ defined by the scalar product
$\varphi_i(x) = \langle \alpha_i, x \rangle$, where $\alpha_i$ is the $i$th vector in $V_n$ according to the ascending
alphabetical order.

The Hamming weight of a (0, 1)-sequence $\xi$, denoted by $HW(\xi)$, is the number
of ones in the sequence. Given two functions $f$ and $g$ on $V_n$, the Hamming
distance $d(f, g)$ between them is defined as the Hamming weight of the truth
table of $f(x) \oplus g(x)$, where $x = (x_1, \ldots, x_n)$.

A function $f$ is said to be balanced if its truth table contains an equal number
of ones and zeros.



3 Cryptographic Criteria of Boolean Functions 

The following criteria for cryptographic Boolean functions are often considered: 
balance, nonlinearity, avalanche criterion, correlation immunity, algebraic degree 
and non-zero linear structures. In this paper we focus mainly on nonlinearity and 
correlation immunity. 

The so-called Parseval's equation (Page 416 [6]) is a useful tool in this work:
Let $f$ be a function on $V_n$ and $\xi$ denote the sequence of $f$. Then
$\sum_{i=0}^{2^n-1} \langle \xi, \ell_i \rangle^2 = 2^{2n}$, where $\ell_i$ is the $i$th row of $H_n$, $i = 0, 1, \ldots, 2^n - 1$.

The nonlinearity of a function $f$ on $V_n$, denoted by $N_f$, is the minimal Hamming
distance between $f$ and all affine functions on $V_n$, i.e.,

$$N_f = \min_{i = 1, 2, \ldots, 2^{n+1}} d(f, \varphi_i)$$

where $\varphi_1, \varphi_2, \ldots, \varphi_{2^{n+1}}$ are all the affine functions on $V_n$. High nonlinearity is
useful in resisting a linear attack and a best approximation attack. The following
characterization of nonlinearity will be useful (for a proof see for instance [7]).

Lemma 1. The nonlinearity of $f$ on $V_n$ can be expressed by

$$N_f = 2^{n-1} - \frac{1}{2} \max\{ |\langle \xi, \ell_i \rangle|, \; 0 \le i \le 2^n - 1 \}$$

where $\xi$ is the sequence of $f$ and $\ell_0, \ldots, \ell_{2^n-1}$ are the rows of $H_n$, namely, the
sequences of linear functions on $V_n$.

From Lemma 1 and Parseval's equation, it is easy to verify that $N_f \le 2^{n-1} -
2^{\frac{1}{2}n-1}$ for any function $f$ on $V_n$. If $N_f = 2^{n-1} - 2^{\frac{1}{2}n-1}$, then $f$ is called a bent
function [8]. It is known that a bent function on $V_n$ exists only when $n$ is even.
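
Lemma 1 and Parseval's equation can be illustrated on a small case. The sketch below assumes truth tables are Python lists indexed by the integer whose binary form is $(x_1, \ldots, x_n)$, with $x_1$ the top bit. The function names and the choice of example ($x_1 x_2 \oplus x_3 x_4$ on $V_4$) are illustrative and not taken from the paper.

```python
def walsh(tt):
    """Scalar products <xi, l_i> of the sequence of f with every row l_i of H_n."""
    v = [1 - 2 * b for b in tt]  # (1,-1)-sequence of f
    h = 1
    while h < len(v):  # in-place fast Walsh-Hadamard transform
        for start in range(0, len(v), 2 * h):
            for k in range(start, start + h):
                v[k], v[k + h] = v[k] + v[k + h], v[k] - v[k + h]
        h *= 2
    return v


def nonlinearity_walsh(tt):
    """Lemma 1: N_f = 2^(n-1) - (1/2) max |<xi, l_i>|."""
    return len(tt) // 2 - max(abs(s) for s in walsh(tt)) // 2


def nonlinearity_bruteforce(tt):
    """Minimal Hamming distance from f to all 2^(n+1) affine functions."""
    size = len(tt)
    return min(
        sum(tt[x] != (bin(w & x).count("1") + c) % 2 for x in range(size))
        for w in range(size)
        for c in (0, 1)
    )


# Illustrative example: f(x1, x2, x3, x4) = x1 x2 XOR x3 x4.
bent_tt = [(((x >> 3) & (x >> 2)) ^ ((x >> 1) & x)) & 1 for x in range(16)]
spectrum = walsh(bent_tt)
parseval = sum(s * s for s in spectrum)  # should equal 2^(2n)
print(nonlinearity_walsh(bent_tt), nonlinearity_bruteforce(bent_tt), parseval)
```

Both routes give the same nonlinearity, and for this example the value meets the bound $2^{n-1} - 2^{\frac{1}{2}n-1}$, so the example function is bent.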

Let $f$ be a function on $V_n$. For a vector $\alpha \in V_n$, denote by $\xi(\alpha)$ the sequence
of $f(x \oplus \alpha)$. Thus $\xi(0)$ is the sequence of $f$ itself and $\xi(0) * \xi(\alpha)$ is the sequence
of $f(x) \oplus f(x \oplus \alpha)$. Set $\Delta_f(\alpha) = \langle \xi(0), \xi(\alpha) \rangle$, the scalar product of $\xi(0)$ and $\xi(\alpha)$.
$\Delta_f(\alpha)$ is called the auto-correlation of $f$ with a shift $\alpha$. We omit the subscript of
$\Delta_f(\alpha)$ if no confusion occurs. Obviously, $\Delta(\alpha) = 0$ if and only if $f(x) \oplus f(x \oplus \alpha)$
is balanced, i.e., $f$ satisfies the avalanche criterion with respect to $\alpha$. In the
case that $f$ does not satisfy the avalanche criterion with respect to a vector $\alpha$,
it may be desirable for $f(x) \oplus f(x \oplus \alpha)$ to be almost balanced. That is, one
may require $|\Delta(\alpha)|$ to be a small value. In an extreme case, $\alpha \in V_n$ is called
a linear structure of $f$ if $|\Delta(\alpha)| = 2^n$ (i.e., $f(x) \oplus f(x \oplus \alpha)$ is a constant). For
any function $f$, $\Delta(\alpha_0) = 2^n$, where $\alpha_0$ is the zero vector in $V_n$. It is easy to
verify that the set of all linear structures of a function $f$ forms a linear subspace
of $V_n$, whose dimension is called the linearity of $f$, denoted by $L_f$. A non-zero
linear structure is cryptographically undesirable, hence we should avoid non-zero
linear structures in the design of cryptographic functions as far as possible.
It is also well known that if $f$ has non-zero linear structures, then there exists a
nonsingular $n \times n$ matrix $B$ over GF(2) such that $f(xB) = g(y) \oplus \psi(z)$, where
$x = (y, z)$, $y \in V_p$, $z \in V_q$, $g$ is a function on $V_p$ that has no non-zero linear
structures, and $\psi$ is a linear function on $V_q$.
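
The auto-correlation and the set of linear structures are easy to compute for small $n$. The following sketch assumes the same integer encoding of vectors as the truth-table ordering; the example function $x_1 x_2 \oplus x_3$ on $V_3$ and the helper names are illustrative choices, not from the paper.

```python
def autocorrelation(tt):
    """Delta(alpha) = <xi(0), xi(alpha)> for every shift alpha."""
    size = len(tt)
    return [
        sum((-1) ** (tt[x] ^ tt[x ^ a]) for x in range(size))
        for a in range(size)
    ]


def linear_structures(tt):
    """All alpha with |Delta(alpha)| = 2^n, i.e. f(x) XOR f(x XOR alpha) constant."""
    return [a for a, d in enumerate(autocorrelation(tt)) if abs(d) == len(tt)]


# Illustrative example on V_3: f(x1, x2, x3) = x1 x2 XOR x3 (x1 is the top bit).
tt = [(((x >> 2) & (x >> 1)) ^ x) & 1 for x in range(8)]
delta = autocorrelation(tt)
structures = linear_structures(tt)
absolute_indicator = max(abs(d) for d in delta[1:])
print(delta, structures, absolute_indicator)
```

For this example the only non-zero linear structure is $(0, 0, 1)$, which matches the decomposition $f = g(y) \oplus \psi(z)$ with $g(y) = x_1 x_2$ and $\psi(z) = x_3$.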

The concept of correlation immune functions was introduced by Siegenthaler
[10]. Xiao and Massey gave an equivalent definition [1,4]: A function $f$ on $V_n$ is
called an $m$th-order correlation immune function if

$$\sum_{x \in V_n} f(x) (-1)^{\langle \beta, x \rangle} = 0$$

for all $\beta \in V_n$ with $1 \le HW(\beta) \le m$, where in the sum, $f(x)$ and $\langle \beta, x \rangle$ are
regarded as real-valued functions. From the first equality in Section 4.2 of [1],
a correlation immune function can also be equivalently restated as follows: Let
$f$ be a function on $V_n$ and let $\xi$ be its sequence. Then $f$ is called an $m$th-order
correlation immune function if $\langle \xi, \ell \rangle = 0$ for every $\ell$, where $\ell$ is the sequence
of a linear function $\varphi(x) = \langle \alpha, x \rangle$ on $V_n$ constrained by $1 \le HW(\alpha) \le m$. In
fact, $\langle \xi, \ell_i \rangle = 0$, where $\ell_i$ is the $i$th row of $H_n$, if and only if $f(x) \oplus \langle \alpha_i, x \rangle$ is
balanced, where $\alpha_i$ is the binary representation of an integer $i$, $0 \le i \le 2^n - 1$.
Correlation immune functions are used in the design of running-key generators
in stream ciphers to resist a correlation attack and in the design of hash functions.
Relevant discussions on correlation immune functions, and more generally on resilient
functions, can be found in [12].
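
The restated definition suggests a direct test: compute all scalar products $\langle \xi, \ell_i \rangle$ and find the largest $m$ for which they vanish on every index of Hamming weight 1 to $m$. The sketch below is one way to do this, assuming integer-encoded vectors; `ci_order` is a hypothetical helper name.

```python
def walsh(tt):
    """Scalar products <xi, l_i> with every row l_i of H_n (fast transform)."""
    v = [1 - 2 * b for b in tt]
    h = 1
    while h < len(v):
        for start in range(0, len(v), 2 * h):
            for k in range(start, start + h):
                v[k], v[k + h] = v[k] + v[k + h], v[k] - v[k + h]
        h *= 2
    return v


def ci_order(tt):
    """Largest m such that <xi, l_i> = 0 whenever 1 <= HW(alpha_i) <= m."""
    spectrum = walsh(tt)
    n = len(tt).bit_length() - 1
    m = 0
    for weight in range(1, n + 1):
        if any(spectrum[i] != 0 for i in range(len(tt))
               if bin(i).count("1") == weight):
            break
        m = weight
    return m


parity_tt = [bin(x).count("1") % 2 for x in range(8)]          # x1 XOR x2 XOR x3
mixed_tt = [(((x >> 2) & (x >> 1)) ^ x) & 1 for x in range(8)]  # x1 x2 XOR x3
print(ci_order(parity_tt), ci_order(mixed_tt))
```

Balance is a separate condition ($\langle \xi, \ell_0 \rangle = 0$) and is not checked by `ci_order`.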

4 A Tight Lower Bound on the Absolute Indicators of
Correlation Immune Functions

Let $f$ be a function on $V_n$ and $\xi$ denote the sequence of $f$. We introduce two
new notations:

1. Set $\Im_f = \{ i \mid \langle \xi, \ell_i \rangle \ne 0, \; 0 \le i \le 2^n - 1 \}$, where $\ell_i$ is the $i$th row of $H_n$.

2. Set $\Im_f^* = \{ \alpha_i \mid \langle \xi, \ell_i \rangle \ne 0, \; 0 \le i \le 2^n - 1 \}$, where $\alpha_i$ is the binary
representation of an integer $i$, $0 \le i \le 2^n - 1$, and $\ell_{\alpha_i}$ is identified with $\ell_i$.

$\Im_f^*$ is essentially the same as $\Im_f$, with the only difference being that its elements
are represented by binary vectors in $V_n$. We will simply write $\Im_f$ as $\Im$
and $\Im_f^*$ as $\Im^*$ when no confusion arises. It is easy to verify that $\#\Im_f$ and $\#\Im_f^*$
are invariant under any nonsingular linear transformation on the variables of the
function $f$. $\#\Im_f$ ($\#\Im_f^*$) together with the distribution of $\Im_f$ ($\Im_f^*$) determines
the correlation immunity and other cryptographic properties of a function.

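Lemma 2. Let $f$ be a function on $V_n$, $\beta$ be a vector in $V_n$ and $B$ be a nonsingular
$n \times n$ matrix over GF(2). Then the following statements hold:

(i) Set $g(x) = f(xB \oplus \beta)$. Then $\Im_g^* = \Im_f^* B^T$.

(ii) Set $g(x) = f(x \oplus \beta)$. Then $\Im_g^* = \Im_f^*$.

(iii) Set $g(x) = f(xB)$. Then $\Im_g^* = \Im_f^* B^T$, where $X B^T = \{ \alpha B^T \mid \alpha \in X \}$.

(iv) Set $g(x) = f(x) \oplus \varphi(x)$, where $\varphi(x) = \langle \beta, x \rangle$. Then $\Im_g^* = \beta \oplus \Im_f^*$, where
$\beta \oplus X = \{ \beta \oplus \gamma \mid \gamma \in X \}$.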

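Proof. Since (ii), (iii) and (iv) together imply (i), we prove (ii), (iii) and (iv)
only.

(ii) $\alpha \in \Im_g^*$ $\Longleftrightarrow$ $g(x) \oplus \langle \alpha, x \rangle$ is unbalanced, i.e., $f(x \oplus \beta) \oplus \langle \alpha, x \rangle$ is unbalanced
$\Longleftrightarrow$ $f(x \oplus \beta) \oplus \langle \alpha, x \oplus \beta \rangle$ is unbalanced $\Longleftrightarrow$ $f(u) \oplus \langle \alpha, u \rangle$ is unbalanced, where
$u = x \oplus \beta$ $\Longleftrightarrow$ $\alpha \in \Im_f^*$. This proves $\Im_g^* = \Im_f^*$.

(iii) $\alpha \in \Im_g^*$ $\Longleftrightarrow$ $g(x) \oplus \langle \alpha, x \rangle$ is unbalanced, i.e., $f(xB) \oplus \langle \alpha, x \rangle$ is unbalanced
$\Longleftrightarrow$ $f(u) \oplus \langle \alpha, uB^{-1} \rangle$ is unbalanced, where $xB = u$. Note that $\langle \alpha, uB^{-1} \rangle =
(uB^{-1}) \alpha^T = u (B^{-1} \alpha^T) = (B^{-1} \alpha^T)^T u^T = \alpha (B^T)^{-1} u^T = \langle \alpha (B^T)^{-1}, u \rangle$. Therefore
$f(u) \oplus \langle \alpha, uB^{-1} \rangle$ is unbalanced $\Longleftrightarrow$ $f(u) \oplus \langle \alpha (B^T)^{-1}, u \rangle$ is unbalanced $\Longleftrightarrow$
$\alpha (B^T)^{-1} \in \Im_f^*$ $\Longleftrightarrow$ $\alpha \in \Im_f^* B^T$. This proves $\Im_g^* = \Im_f^* B^T$.

(iv) $\alpha \in \Im_g^*$ $\Longleftrightarrow$ $g(x) \oplus \langle \alpha, x \rangle$ is unbalanced, i.e., $f(x) \oplus \langle \beta, x \rangle \oplus \langle \alpha, x \rangle$ is
unbalanced $\Longleftrightarrow$ $f(x) \oplus \langle \beta \oplus \alpha, x \rangle$ is unbalanced $\Longleftrightarrow$ $\beta \oplus \alpha \in \Im_f^*$ $\Longleftrightarrow$ $\alpha \in \beta \oplus \Im_f^*$.
This proves $\Im_g^* = \beta \oplus \Im_f^*$. □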

The following definition is from [11]. 

Definition 1. For a function $f$ on $V_n$, the absolute indicator for GAC of $f$ is
defined as

$$\Delta_f = \max_{\alpha \in V_n, \, \alpha \ne 0} |\Delta(\alpha)|$$

Obviously $\Delta_f = 2^n$ if and only if $f$ has a non-zero linear structure, while
$\Delta_f = 0$ if and only if $f$ is bent. Since balanced functions are not bent, we
have $\Delta_f > 0$ where $f$ is balanced. In designing cryptographic algorithms, we
are concerned with a balanced nonlinear function $f$ that shows a small $\Delta_f$, as
was discussed in [11] where it was argued that a smaller $\Delta_f$ is cryptographically
more desirable. This section shows that a high order of correlation immunity
may result in weaknesses in avalanche characteristics.

The following lemma is the re-statement of a relation proved in Section 2 
of [2]. 

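Lemma 3. For every function $f$ on $V_n$, we have

$$(\Delta(\alpha_0), \Delta(\alpha_1), \ldots, \Delta(\alpha_{2^n-1})) H_n = (\langle \xi, \ell_0 \rangle^2, \langle \xi, \ell_1 \rangle^2, \ldots, \langle \xi, \ell_{2^n-1} \rangle^2)$$

where $\xi$ denotes the sequence of $f$ and $\ell_i$ is the $i$th row of $H_n$, $i = 0, 1, \ldots, 2^n - 1$.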
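
Lemma 3 can be spot-checked numerically. The sketch below uses a randomly chosen (1, -1)-sequence and assumes integer-encoded vectors; `wht` is a hypothetical helper that multiplies a row vector by the symmetric matrix $H_n$ using the fast transform.

```python
import random


def wht(v):
    """Return the row vector v * H_n, computed by the fast Walsh-Hadamard transform."""
    v = list(v)
    h = 1
    while h < len(v):
        for start in range(0, len(v), 2 * h):
            for k in range(start, start + h):
                v[k], v[k + h] = v[k] + v[k + h], v[k] - v[k + h]
        h *= 2
    return v


def autocorrelation(seq):
    """Delta(alpha) for a (1,-1)-sequence: sum over x of seq[x] * seq[x XOR alpha]."""
    size = len(seq)
    return [sum(seq[x] * seq[x ^ a] for x in range(size)) for a in range(size)]


random.seed(2000)  # arbitrary seed, only for reproducibility
n = 4
seq = [random.choice((1, -1)) for _ in range(2 ** n)]

lhs = wht(autocorrelation(seq))      # (Delta(alpha_0), ..., Delta(alpha_{2^n-1})) H_n
rhs = [s * s for s in wht(seq)]      # (<xi, l_0>^2, ..., <xi, l_{2^n-1}>^2)
lemma3_holds = lhs == rhs
print(lemma3_holds)
```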

From [14], we have the following statement. 

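Lemma 4. Consider a function $f$ on $V_n$. Let $\xi = (a_0, a_1, \ldots, a_{2^n-1})$, where
$a_j = \pm 1$, denote the sequence of $f$ and $\ell_i$ denote the $i$th row of $H_n$, $i =
0, 1, \ldots, 2^n - 1$. Let $p$ be an integer with $1 \le p \le n - 1$. Write $\xi =
(\xi_0, \xi_1, \ldots, \xi_{2^p-1})$ where each $\xi_i$ is of length $2^{n-p}$. Let $e_i$ denote the $i$th row
of $H_{n-p}$, $i = 0, 1, \ldots, 2^{n-p} - 1$. Then

$$2^p (\langle \xi_0, e_j \rangle, \langle \xi_1, e_j \rangle, \langle \xi_2, e_j \rangle, \ldots, \langle \xi_{2^p-1}, e_j \rangle)
= (\langle \xi, \ell_j \rangle, \langle \xi, \ell_{j+2^{n-p}} \rangle, \langle \xi, \ell_{j+2 \cdot 2^{n-p}} \rangle, \ldots, \langle \xi, \ell_{j+(2^p-1) 2^{n-p}} \rangle) H_p$$

where $j = 0, 1, \ldots, 2^{n-p} - 1$.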
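
The block identity of Lemma 4 can be checked on a random sequence. This is a sketch under the assumption that the sequence is split into $2^p$ consecutive blocks of length $2^{n-p}$; the parameter values $n = 5$, $p = 2$ and the helper `wht` are arbitrary illustrative choices.

```python
import random


def wht(v):
    """Return the row vector v * H_k for a vector of length 2^k (fast transform)."""
    v = list(v)
    h = 1
    while h < len(v):
        for start in range(0, len(v), 2 * h):
            for k in range(start, start + h):
                v[k], v[k + h] = v[k] + v[k + h], v[k] - v[k + h]
        h *= 2
    return v


random.seed(4)  # arbitrary seed, only for reproducibility
n, p = 5, 2
block = 2 ** (n - p)
seq = [random.choice((1, -1)) for _ in range(2 ** n)]

full = wht(seq)  # full[i] = <xi, l_i>
# per_block[b][j] = <xi_b, e_j>, with e_j the j-th row of H_{n-p}
per_block = [wht(seq[b * block:(b + 1) * block]) for b in range(2 ** p)]

lemma4_holds = all(
    [2 ** p * per_block[b][j] for b in range(2 ** p)]
    == wht([full[j + a * block] for a in range(2 ** p)])
    for j in range(block)
)
print(lemma4_holds)
```

The identity follows from $H_n = H_p \otimes H_{n-p}$ and $H_p H_p = 2^p I$, which is why the same transform routine serves on both sides.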

The following lemma is useful in proving one of our main theorems. 

Lemma 5. Let $(k_0, k_1, \ldots, k_{2^n-1}) H_n = (r_0, r_1, \ldots, r_{2^n-1})$, where $k_0 = 0$ and
each $k_j$ and each $r_j$ are real numbers. Then

$$\max\{ |r_1|, \ldots, |r_{2^n-1}| \} \ge \max\{ |k_1|, \ldots, |k_{2^n-1}| \}$$

Proof. Without loss of generality, we assume that $|k_{2^{n-1}}| =
\max\{ |k_1|, \ldots, |k_{2^n-1}| \}$. Let $H_n = [P \; Q]$ where both $P$ and $Q$ are $2^n \times 2^{n-1}$
matrices. Hence we have $(k_0, k_1, \ldots, k_{2^n-1}) Q = (r_{2^{n-1}}, r_{2^{n-1}+1}, \ldots, r_{2^n-1})$. Let
$e_0$ denote the all-one sequence of length $2^{n-1}$. It is obvious that

$$(k_0, k_1, \ldots, k_{2^n-1}) Q e_0^T = (r_{2^{n-1}}, r_{2^{n-1}+1}, \ldots, r_{2^n-1}) e_0^T \qquad (1)$$

Note that $Q = \begin{bmatrix} H_{n-1} \\ -H_{n-1} \end{bmatrix}$ and hence we have $Q e_0^T = 2^{n-1} (b_0, b_1, \ldots, b_{2^n-1})^T$,
where $(b_0, b_1, \ldots, b_{2^n-1})$ satisfies $b_0 = 1$, $b_{2^{n-1}} = -1$ and all other $b_j = 0$. Due
to (1), we have $2^{n-1} (k_0 - k_{2^{n-1}}) = \sum_{j=2^{n-1}}^{2^n-1} r_j$, where $k_0 = 0$. This proves that
there exists some $i_0$, $2^{n-1} \le i_0 \le 2^n - 1$, such that $|r_{i_0}| \ge |k_{2^{n-1}}|$. Thus the
lemma holds. □
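
As a sanity check of Lemma 5, one can draw random vectors with $k_0 = 0$ and compare the two maxima. The sketch below uses integer entries for exact arithmetic; the value range, trial count and helper names are arbitrary assumptions.

```python
import random


def wht(v):
    """Return the row vector v * H_n, computed by the fast Walsh-Hadamard transform."""
    v = list(v)
    h = 1
    while h < len(v):
        for start in range(0, len(v), 2 * h):
            for k in range(start, start + h):
                v[k], v[k + h] = v[k] + v[k + h], v[k] - v[k + h]
        h *= 2
    return v


def lemma5_holds(n, trials=200):
    """Test max|r_j| >= max|k_j| over j >= 1 on random vectors with k_0 = 0."""
    for _ in range(trials):
        k = [0] + [random.randint(-50, 50) for _ in range(2 ** n - 1)]
        r = wht(k)
        if max(abs(v) for v in r[1:]) < max(abs(v) for v in k[1:]):
            return False
    return True


random.seed(5)  # arbitrary seed, only for reproducibility
results = [lemma5_holds(n) for n in (1, 2, 3, 4)]
print(results)
```

A random search is of course no proof; it only guards against a misreading of the statement.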

We notice that $\max\{ |r_0|, |r_1|, \ldots, |r_{2^n-1}| \} \ge \max\{ |k_0|, |k_1|, \ldots, |k_{2^n-1}| \}$ is
still true. However, this inequality is less useful in this paper, as $\Delta(\alpha_0) = 2^n$
holds for every function on $V_n$, and we are concerned with $\Delta_f$ where $\Delta_f =
\max_{\alpha \in V_n, \, \alpha \ne 0} |\Delta(\alpha)|$.

Theorem 1. Let $f$ be a function on $V_n$. Then the following statements hold:

(i) If there exist an $m$-dimensional linear subspace $W$, $1 \le m \le n - 1$, and a
vector $\alpha^*$ in $V_n$ such that $\Im_f^* \cap (\alpha^* \oplus W) = \emptyset$, where $\emptyset$ denotes the empty set,
then

$$\Delta_f \ge 2^m \sum_{i=0}^{+\infty} 2^{i(m-n)} \qquad (2)$$

(ii) Under the assumption of (i), the following three statements are equivalent:

(a) $\Delta_f = 2^m \sum_{i=0}^{+\infty} 2^{i(m-n)}$,

(b) $m = n - 1$,

(c) $f$ has a non-zero linear structure.

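Proof. First we prove (i). Due to Lemma 2, we can assume, without loss of
generality, that $\alpha^* = \alpha_0$, where $\alpha_0$ denotes the zero vector in $V_n$, and $W =
\{ \alpha_0, \alpha_1, \ldots, \alpha_{2^m-1} \}$. Let $\xi$ denote the sequence of $f$ and $\ell_i$ be the $i$th row of $H_n$,
$i = 0, 1, \ldots, 2^n - 1$. Since $\Im_f^* \cap W = \emptyset$, we have $\langle \xi, \ell_i \rangle = 0$, $i = 0, 1, \ldots, 2^m - 1$.
Due to Lemma 3, we have

$$(\Delta(\alpha_0), \Delta(\alpha_1), \ldots, \Delta(\alpha_{2^n-1})) H_n = (0, \ldots, 0, \langle \xi, \ell_{2^m} \rangle^2, \ldots, \langle \xi, \ell_{2^n-1} \rangle^2)$$

or

$$(0, \ldots, 0, \langle \xi, \ell_{2^m} \rangle^2, \ldots, \langle \xi, \ell_{2^n-1} \rangle^2) H_n = 2^n (\Delta(\alpha_0), \Delta(\alpha_1), \ldots, \Delta(\alpha_{2^n-1})) \qquad (3)$$

Applying Lemma 4 (with $p = n - m$ and $j = 0$) to Equation (3), we obtain

$$\Bigl( 0, \sum_{j=2^m}^{2 \cdot 2^m - 1} \langle \xi, \ell_j \rangle^2, \ldots, \sum_{j=2^n-2^m}^{2^n-1} \langle \xi, \ell_j \rangle^2 \Bigr) H_{n-m}
= 2^n (\Delta(\alpha_0), \Delta(\alpha_{2^m}), \Delta(\alpha_{2 \cdot 2^m}), \ldots, \Delta(\alpha_{(2^{n-m}-1) \cdot 2^m})) \qquad (4)$$

Applying Parseval's equation to $f$, we have $\sum_{j=2^m}^{2^n-1} \langle \xi, \ell_j \rangle^2 = 2^{2n}$.
It is easy to see that there exists some $i_0$, $1 \le i_0 \le 2^{n-m} - 1$, such that

$$\sum_{j=i_0 \cdot 2^m}^{(i_0+1) \cdot 2^m - 1} \langle \xi, \ell_j \rangle^2 \ge \frac{2^{2n}}{2^{n-m} - 1} = 2^{n+m} \sum_{i=0}^{+\infty} 2^{i(m-n)}$$

Applying Lemma 5 to (4), we conclude that there exists some $j_0$, $1 \le j_0 \le
2^{n-m} - 1$, such that

$$2^n |\Delta(\alpha_{j_0 \cdot 2^m})| \ge 2^{n+m} \sum_{i=0}^{+\infty} 2^{i(m-n)}$$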
This proves that $\Delta_f \ge 2^m \sum_{i=0}^{+\infty} 2^{i(m-n)}$, hence (i) holds. Next we prove (ii).

First we prove (a) $\Longleftrightarrow$ (b). Assume that (a) holds, i.e., $\Delta_f = 2^m \sum_{i=0}^{+\infty} 2^{i(m-n)}$,
or equivalently, $\Delta_f = \frac{2^n}{2^{n-m} - 1}$. Therefore
$\frac{2^n}{2^{n-m} - 1}$ must be an integer. Since $2^n$ is not divisible by $2^{n-m} - 1$ if
$n - m \ge 2$, we conclude that $m = n - 1$, i.e., (b) holds. Conversely, assume
that (b) holds, i.e., $m = n - 1$. In this case, by using (i) of the theorem, we have
$\Delta_f \ge 2^m \sum_{i=0}^{+\infty} 2^{i(m-n)} = 2^n$. Hence $\Delta_f = 2^n$, i.e., (a) holds.

We now prove (b) $\Longleftrightarrow$ (c). Assume that (b) holds, i.e., $m = n - 1$. In this case,
by using (i) of the theorem, we have $\Delta_f \ge 2^m \sum_{i=0}^{+\infty} 2^{i(m-n)} = \frac{2^n}{2^{n-m} - 1} = 2^n$.
Hence $\Delta_f = 2^n$. This means that $f$ has a non-zero linear structure and hence (c)
holds. Conversely, assume that (c) holds, i.e., $f$ has a non-zero linear structure.
Due to Lemma 2, without loss of generality, assume that $\alpha_{2^{n-1}}$ is a non-zero
linear structure. Hence we can write $f$ as $f(x) = c x_1 \oplus g(y)$ where $g$ is a function
on $V_{n-1}$, $x = (x_1, \ldots, x_n)$, $y = (x_2, \ldots, x_n)$ and $c$ is a constant in GF(2).
Once again, due to Lemma 2, without loss of generality, assume that $c = 0$.
Let $\eta$ denote the sequence of $g$. Then the sequence $\xi$ of $f$ can be denoted as
$\xi = (\eta, \eta)$. It is easy to verify that $\langle \xi, \ell_i \rangle = 0$, where $\ell_i$ is the $i$th row of $H_n$,
$i = 2^{n-1}, 2^{n-1} + 1, \ldots, 2^n - 1$. This proves that $\Im_f^* \cap (\alpha_{2^{n-1}} \oplus W) = \emptyset$, where $W$ is specialized as an
$(n - 1)$-dimensional subspace, that is, $W = \{ \alpha_0, \alpha_1, \ldots, \alpha_{2^{n-1}-1} \}$. This proves
that $m = n - 1$ and hence (b) holds. □

From the definition of correlation immune functions [1,4], if $f$ is a balanced
$m$th-order correlation immune function, then $m \le n - 1$, and such a function on $V_n$
is $(n - 1)$th-order correlation immune if and only if $f(x) = x_1 \oplus \cdots \oplus x_n \oplus c$, where
$x = (x_1, \ldots, x_n)$ and $c$ is a constant in GF(2). Using Theorem 1, we obtain

Theorem 2. Let $f$ be a balanced $m$th-order correlation immune function on
$V_n$ ($1 \le m \le n - 1$). Then

$$\Delta_f \ge 2^m \sum_{i=0}^{+\infty} 2^{i(m-n)}$$

where the equality holds if and only if $f(x) = x_1 \oplus \cdots \oplus x_n \oplus c$, where $x =
(x_1, \ldots, x_n)$ and $c$ is a constant in GF(2).
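
For small $n$ the bound of Theorem 2 can be verified exhaustively. The sketch below enumerates all functions on $V_3$ and uses the closed form $2^m \sum_{i \ge 0} 2^{i(m-n)} = 2^n / (2^{n-m} - 1)$, written as the integer inequality $\Delta_f (2^{n-m} - 1) \ge 2^n$. The encoding of a function as an integer `code` (bit $x$ of `code` is $f(\alpha_x)$) and all helper names are assumptions made for this example.

```python
def wht(v):
    """Return the row vector v * H_n, computed by the fast Walsh-Hadamard transform."""
    v = list(v)
    h = 1
    while h < len(v):
        for start in range(0, len(v), 2 * h):
            for k in range(start, start + h):
                v[k], v[k + h] = v[k] + v[k + h], v[k] - v[k + h]
        h *= 2
    return v


def ci_order(spectrum, n):
    """Largest m with spectrum zero on all indices of Hamming weight 1..m."""
    m = 0
    for weight in range(1, n + 1):
        if any(spectrum[i] for i in range(2 ** n) if bin(i).count("1") == weight):
            break
        m = weight
    return m


def check_theorem2(n):
    """Return (bound holds for all balanced CI functions, codes attaining equality)."""
    size = 2 ** n
    equality_cases = set()
    for code in range(2 ** size):
        seq = [1 - 2 * ((code >> x) & 1) for x in range(size)]
        spectrum = wht(seq)
        if spectrum[0] != 0:
            continue  # not balanced
        m = ci_order(spectrum, n)
        # absolute indicator: max |Delta(alpha)| over non-zero alpha
        delta = max(
            abs(sum(seq[x] * seq[x ^ a] for x in range(size)))
            for a in range(1, size)
        )
        for order in range(1, min(m, n - 1) + 1):
            if delta * (2 ** (n - order) - 1) < 2 ** n:
                return False, equality_cases
            if delta * (2 ** (n - order) - 1) == 2 ** n:
                equality_cases.add(code)
    return True, equality_cases


ok, equality_cases = check_theorem2(3)
print(ok, sorted(equality_cases))
```

On $V_3$ the only functions attaining equality are $x_1 \oplus x_2 \oplus x_3$ and its complement, in line with the theorem. The same routine runs for $n = 4$, only more slowly.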

Let $f$ be a function on $V_n$ whose sequence is $\xi$. Assume that $f$ satisfies
$\langle \xi, \ell_i \rangle = 0$ for every $i = 1, \ldots, 2^n - 1$, or equivalently, $f(x) \oplus \langle \alpha, x \rangle$ is balanced
for every non-zero vector $\alpha$ in $V_n$. It is easy to verify that $f$ must be a constant
in GF(2). For this reason, we define the zero function on $V_n$ and the non-zero
constant function on $V_n$ as $n$th-order correlation immune functions on $V_n$.

Theorem 3. Let $f$ be an unbalanced $m$th-order correlation immune function
on $V_n$ ($2 \le m \le n$). Then

$$\Delta_f \ge 2^{m-1} \sum_{i=0}^{+\infty} 2^{i(m-1-n)}$$

where the equality holds if and only if $f$ is a constant. (Note that an $n$th-order
correlation immune function is defined as a constant.)

Proof. Let $\beta \in V_n$ and $HW(\beta) = m$. Set $\varphi(x) = \langle \beta, x \rangle$ and $g = f \oplus \varphi$. It is easy
to see that $g$ is a balanced $(m - 1)$th-order correlation immune function on $V_n$.
Due to Theorem 2, the statement holds. □

Theorems 2 and 3 indicate that correlation immunity is not harmonious with 
avalanche characteristics. 

5 A Relationship between Correlation Immunity and
Linear Structures

In this section, we consider the case when the absolute indicator for GAC achieves
the maximum value, i.e., $\Delta_f = 2^n$.

Theorem 4. Let $f$ be a function on $V_n$. Then there exist a $p$-dimensional linear
subspace $W$ with $1 \le p \le n - 1$ and a vector $\alpha$ in $V_n$ such that $\Im_f^* \subseteq \alpha \oplus W$ if
and only if $f$ has a non-zero linear structure.

Proof. We first prove the necessity. Since the existence of non-zero linear structures
is invariant under a nonsingular linear transformation on the variables,
without loss of generality, we can assume $W = \{ (a_1, \ldots, a_p, 0, \ldots, 0) \mid (a_1, \ldots, a_p,
0, \ldots, 0) \in V_n \}$. In other words, $W = \{ \alpha_0, \alpha_{2^{n-p}}, \alpha_{2 \cdot 2^{n-p}}, \ldots, \alpha_{(2^p-1) \cdot 2^{n-p}} \}$,
where each $\alpha_j \in V_n$ is the binary representation of an integer $j$. Let
$W^* = \{ (0, \ldots, 0, c_1, \ldots, c_{n-p}) \mid (0, \ldots, 0, c_1, \ldots, c_{n-p}) \in V_n \}$. In other words,
$W^* = \{ \alpha_0, \alpha_1, \ldots, \alpha_{2^{n-p}-1} \}$, where each $\alpha_j \in V_n$ is the binary representation
of an integer $j$, $j = 0, 1, \ldots, 2^{n-p} - 1$, and

$$V_n = (\alpha_0 \oplus W) \cup (\alpha_1 \oplus W) \cup \cdots \cup (\alpha_{2^{n-p}-1} \oplus W)$$

where $(\alpha_j \oplus W) \cap (\alpha_i \oplus W) = \emptyset$ whenever $j \ne i$.

Since $\Im_f^* \subseteq \alpha_{j_0} \oplus W$ for some $j_0$, $0 \le j_0 \le 2^{n-p} - 1$, $\langle \xi, \ell_i \rangle = 0$ if $\alpha_i \in \alpha_j \oplus W$
with $j \ne j_0$, where $\alpha_i$ is the binary representation of an integer $i$. Note that $\alpha_i \in \alpha_j \oplus W$
if and only if $i \in \{ j, j + 2^{n-p}, \ldots, j + (2^p - 1) 2^{n-p} \}$. By using Lemma 4, we
have

$$2^p (\langle \xi_0, e_i \rangle, \langle \xi_1, e_i \rangle, \ldots, \langle \xi_{2^p-1}, e_i \rangle)
= (\langle \xi, \ell_i \rangle, \langle \xi, \ell_{i+2^{n-p}} \rangle, \ldots, \langle \xi, \ell_{i+(2^p-1) 2^{n-p}} \rangle) H_p = (0, 0, \ldots, 0)$$

whenever $i \ne j_0$. Therefore

$$(\langle \xi_0, e_i \rangle, \langle \xi_1, e_i \rangle, \ldots, \langle \xi_{2^p-1}, e_i \rangle) = (0, 0, \ldots, 0) \qquad (5)$$

whenever $i \ne j_0$. Since $\langle \xi_0, e_i \rangle = 0$ whenever $i \ne j_0$, we conclude $\xi_0 = b_0 e_{j_0}$,
where $b_0 = \pm 1$. Similarly $\xi_1 = b_1 e_{j_0}$ where $b_1 = \pm 1$, ..., $\xi_{2^p-1} = b_{2^p-1} e_{j_0}$ where
$b_{2^p-1} = \pm 1$. Therefore the sequence of $f$, $\xi$, satisfies

$$\xi = (b_0 e_{j_0}, b_1 e_{j_0}, \ldots, b_{2^p-1} e_{j_0}) \qquad (6)$$

Since $e_{j_0}$ is a row of $H_{n-p}$, $e_{j_0}$ is the sequence of a linear function on $V_{n-p}$,
denoted by $\psi$. Let $(b_0, b_1, \ldots, b_{2^p-1})$ be the sequence of a function on $V_p$, denoted
by $g$. Due to (6), $f$ can be expressed as $f(x) = g(y) \oplus \psi(z)$ where $x = (y, z)$,
$y \in V_p$, $z \in V_{n-p}$. This proves that $f$ has a non-zero linear structure.

Conversely, assume that $f$ has a non-zero linear structure. Then $f$ is equivalent
to $g(x) = c x_1 \oplus h(y)$ under a nonsingular linear transformation on the
variables, where $h$ is a function on $V_{n-1}$, $x = (x_1, \ldots, x_n)$ and $y = (x_2, \ldots, x_n)$.
Without loss of generality, assume that $c = 0$. Let $\xi'$ denote the sequence of
$g$ and $\eta$ denote the sequence of $h$. Then $\xi' = (\eta, \eta)$. Obviously, if $\ell_i$ satisfies
$\ell_i = (e, -e)$, where $\ell_i$ denotes the $i$th row of $H_n$ and $e$ is a row of $H_{n-1}$, we have
$\langle \xi', \ell_i \rangle = 0$. Therefore if $\langle \xi', \ell_j \rangle \ne 0$ then $\ell_j$ must take the form of $\ell_j = (e, e)$.
Due to the structure of $H_n$, $j$ satisfies $0 \le j \le 2^{n-1} - 1$. This proves that
$\Im_g \subseteq \{ 0, 1, \ldots, 2^{n-1} - 1 \}$, or equivalently, $\Im_g^* \subseteq W = \{ \alpha_0, \alpha_1, \ldots, \alpha_{2^{n-1}-1} \}$, where
$W$ obviously is an $(n - 1)$-dimensional subspace of $V_n$. Since the linearity is
invariant under any nonsingular linear transformation on the variables, we have
the same conclusion on $\Im_f^*$. Thus we have proved the sufficiency. □

Theorem 4 can be viewed as a way of characterizing Boolean functions having 
non-zero linear structures by the use of correlation immunity. This result will be 
used in the next section. 

6 A New Result on Upper Bound on Nonlinearity of 
Correlation Immune Functions 

6.1 Previously Known Results 

Recently Zheng and Zhang proved that when $0.6n - 0.4 \le m \le n - 2$, the
nonlinearity $N_f$ of an $m$th-order correlation immune function $f$ with $n$ variables
satisfies the condition of $N_f \le 2^{n-1} - 2^{m+1}$. In the same paper they also showed
that if a correlation immune function achieves the maximum nonlinearity for
such a function, then it is a plateaued function.

The concept of plateaued functions was introduced in [13]. Let $f$ be a function
on $V_n$ and $\xi$ denote the sequence of $f$. If there exists an even number $r$, $0 \le r \le n$,
such that $\#\Im_f = 2^r$ and each $\langle \xi, \ell_j \rangle^2$ takes the value of $2^{2n-r}$ or 0 only, where
$\ell_j$ denotes the $j$th row of $H_n$, $j = 0, 1, \ldots, 2^n - 1$, then $f$ is called an $r$th-order
plateaued function on $V_n$. $f$ is also simply called a plateaued function on $V_n$ if we
ignore the particular order $r$. Some facts about plateaued functions follow: if $f$ is
an $r$th-order plateaued function, then $r$ must be even; $f$ is an $n$th-order plateaued
function if and only if $f$ is bent; and $f$ is a 0th-order plateaued function if and
only if $f$ is affine. Plateaued functions are interesting as they have a number
of cryptographically useful properties [13]. For instance: ^ 

where the equality holds if and only if / is a plateaued function. 

We now introduce a main result in [14]. 

Theorem 5. Let $f$ be an $m$th-order correlation immune function on $V_n$. If $m$
and $n$ satisfy the condition of $0.6n - 0.4 \le m \le n - 2$, then $N_f \le 2^{n-1} - 2^{m+1}$,
where the equality holds if and only if $f$ is also a $2(n - m - 2)$th-order plateaued
function.

Note that Theorem 5 is an improvement on Sarkar and Maitra's upper bound
[9], $N_f \le 2^{n-1} - 2^m$ when $m > \frac{1}{2}n - 1$.

The following result was given by Sarkar and Maitra [9].

Theorem 6. Let $f$ be an $m$th-order correlation immune function on $V_n$, where
$m \le n - 2$. Then $\langle \xi, \ell \rangle \equiv 0 \pmod{2^{m+1}}$, where $\ell$ is any row of $H_n$. In particular,
if $f$ is balanced $m$th-order correlation immune, then $\langle \xi, \ell \rangle \equiv 0 \pmod{2^{m+2}}$.

The following two Lemmas can be found from [14]. 

6.2 A New Result 

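Lemma 4 can be generalized. Let $f$ be a function on $V_n$ and $W$ be a $p$-dimensional
subspace of $V_n$. Let $U = \{ 0, \alpha_{2^{n-p}}, \alpha_{2 \cdot 2^{n-p}}, \ldots, \alpha_{(2^p-1) 2^{n-p}} \}$. Since both $W$
and $U$ are $p$-dimensional subspaces of $V_n$, we can find a nonsingular $n \times n$ matrix $B$ over
GF(2) satisfying $W B^T = U$, where $W B^T = \{ \alpha B^T \mid \alpha \in W \}$. Set $x = uB$ and
$g(u) = f(uB)$. Consider $f(x) \oplus \langle \alpha, x \rangle$ where $\alpha \in W$. Note that $\langle \alpha, x \rangle = x \alpha^T =
u B \alpha^T = u (\alpha B^T)^T = \langle \alpha B^T, u \rangle$. Therefore $f(x) \oplus \langle \alpha, x \rangle = g(u) \oplus \langle \alpha B^T, u \rangle$, where
$\alpha \in W$ and $\alpha B^T \in U$. Let $\eta$ denote the sequence of $g$. Equivalently, we have
$\langle \xi, \ell_j \rangle = \langle \eta, \ell_i \rangle$, where $j$ is the integer whose binary representation is $\alpha \in W$, and $i$ is the integer
whose binary representation is $\alpha B^T \in U$.

Define a permutation $\pi$ on $\{ 0, 1, \ldots, 2^n - 1 \}$ as follows: $\pi(j) = i$ if $\alpha_j B^T = \alpha_i$,
where $\alpha_i$ and $\alpha_j$ are the binary representations of $i$ and $j$ respectively.
Therefore

$$\langle \xi, \ell_j \rangle = \langle \eta, \ell_{\pi(j)} \rangle \quad \text{or} \quad \langle \xi, \ell_{\pi^{-1}(i)} \rangle = \langle \eta, \ell_i \rangle \qquad (7)$$

Rewrite $\eta = (\eta_0, \eta_1, \ldots, \eta_{2^p-1})$ where each $\eta_i$ is of length $2^{n-p}$. Applying Lemma
4 to the function $g$ and the subspace $U$, we have

$$2^p (\langle \eta_0, e_j \rangle, \langle \eta_1, e_j \rangle, \langle \eta_2, e_j \rangle, \ldots, \langle \eta_{2^p-1}, e_j \rangle)
= (\langle \eta, \ell_j \rangle, \langle \eta, \ell_{j+2^{n-p}} \rangle, \ldots, \langle \eta, \ell_{j+(2^p-1) 2^{n-p}} \rangle) H_p$$

where $e_j$ denotes the $j$th row of $H_{n-p}$, $j = 0, 1, \ldots, 2^{n-p} - 1$. Due to (7), we obtain

$$2^p (\langle \eta_0, e_j \rangle, \langle \eta_1, e_j \rangle, \langle \eta_2, e_j \rangle, \ldots, \langle \eta_{2^p-1}, e_j \rangle)
= (\langle \xi, \ell_{\pi^{-1}(j)} \rangle, \langle \xi, \ell_{\pi^{-1}(j+2^{n-p})} \rangle, \ldots, \langle \xi, \ell_{\pi^{-1}(j+(2^p-1) 2^{n-p})} \rangle) H_p \qquad (8)$$

where $e_j$ denotes the $j$th row of $H_{n-p}$, $j = 0, 1, \ldots, 2^{n-p} - 1$.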

Lemma 6. Let $f$ be a function on $V_n$ and $\xi$ denote the sequence of $f$. Let $q$ be
an odd number with $1 \le q \le n - 2$, such that

$\langle \xi, \ell_j \rangle = 0$ for all $j$ such that $HW(\alpha_j) \le q$ and $HW(\alpha_j)$ is odd,

where $\alpha_j \in V_n$ is the binary representation of integer $j$. Then $\langle \xi, \ell_j \rangle \equiv 0
\pmod{2^{q+2}}$ holds for all $j$ with $HW(\alpha_j) = q + 2$, where $\alpha_j \in V_n$ is the binary
representation of an integer $j$.
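
The statement of Lemma 6 can be tested exhaustively for the smallest admissible parameters. The following sketch runs over all functions on $V_3$ with $q = 1$; the integer encoding of functions and the helper names are assumptions for this example, and a larger $n$ only increases the running time.

```python
def wht(v):
    """Return the row vector v * H_n, computed by the fast Walsh-Hadamard transform."""
    v = list(v)
    h = 1
    while h < len(v):
        for start in range(0, len(v), 2 * h):
            for k in range(start, start + h):
                v[k], v[k + h] = v[k] + v[k + h], v[k] - v[k + h]
        h *= 2
    return v


def check_lemma6(n, q):
    """Return (lemma holds on every tested function, number of functions tested)."""
    size = 2 ** n
    weight = [bin(i).count("1") for i in range(size)]
    tested = 0
    for code in range(2 ** size):
        spectrum = wht([1 - 2 * ((code >> x) & 1) for x in range(size)])
        # Hypothesis: <xi, l_j> = 0 for all j of odd Hamming weight at most q.
        hypothesis = all(
            spectrum[j] == 0
            for j in range(size)
            if weight[j] <= q and weight[j] % 2 == 1
        )
        if not hypothesis:
            continue
        tested += 1
        # Conclusion: <xi, l_j> is divisible by 2^(q+2) when HW(alpha_j) = q + 2.
        if any(spectrum[j] % 2 ** (q + 2) for j in range(size) if weight[j] == q + 2):
            return False, tested
    return True, tested


ok, tested = check_lemma6(3, 1)
print(ok, tested)
```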

Proof. Let $U = \{ 0, \alpha_{2^{n-q-1}}, \alpha_{2 \cdot 2^{n-q-1}}, \ldots, \alpha_{(2^{q+1}-1) 2^{n-q-1}} \}$. Obviously $U$ can be
rewritten as $U = \{ (a_1, a_2, \ldots, a_{q+1}, 0, \ldots, 0) \mid (a_1, a_2, \ldots, a_{q+1}, 0, \ldots, 0) \in V_n \}$.
Set

$$W = \{ (a_1, a_2, \ldots, a_{q+2}, 0, \ldots, 0) \mid (a_1, a_2, \ldots, a_{q+2}, 0, \ldots, 0) \in V_n, \; HW(a_1, a_2, \ldots, a_{q+2}) \text{ is even} \}$$

Since both $U$ and $W$ are $(q + 1)$-dimensional subspaces of $V_n$, there exists a nonsingular
$n \times n$ matrix $B$ over GF(2) satisfying

(i) $W B^T = U$, where $W B^T = \{ \alpha B^T \mid \alpha \in W \}$; in particular, we require
$\alpha_{(2^{q+1}-1) 2^{n-q-1}} B^T = \alpha_{(2^{q+1}-1) 2^{n-q-1}}$,

(ii) $\alpha_{2^{n-q-2}} B^T = \alpha_{2^{n-q-2}}$.

Set $x = uB$ and $g(u) = f(uB)$. Let $\eta = (\eta_0, \eta_1, \ldots, \eta_{2^{q+1}-1})$ denote the
sequence of $g$, where each $\eta_i$ is of length $2^{n-q-1}$. Obviously $HW(\alpha)$ is even for
any $\alpha \in W$, i.e., $HW(\alpha)$ takes the values $0, 2, 4, \ldots, q - 1, q + 1$. Therefore
$HW(\alpha_{2^{n-q-2}} \oplus \alpha)$ must be odd, i.e., it takes the values $1, 3, 5, \ldots, q, q + 2$.
Note that $\alpha_{(2^{q+1}-1) 2^{n-q-1}} = (1, \ldots, 1, 0, \ldots, 0)$ and $HW(\alpha_{(2^{q+1}-1) 2^{n-q-1}}) = q + 1$.
Obviously, $\alpha_{2^{n-q-2}} \oplus \alpha_{(2^{q+1}-1) 2^{n-q-1}} = (1, \ldots, 1, 1, 0, \ldots, 0) = \alpha_{(2^{q+2}-1) 2^{n-q-2}}$.
Note that $HW(\alpha_{(2^{q+2}-1) 2^{n-q-2}}) = q + 2$, and for any other $\alpha \in W$ with $\alpha \ne
\alpha_{(2^{q+1}-1) 2^{n-q-1}}$, we have $1 \le HW(\alpha_{2^{n-q-2}} \oplus \alpha) \le q$. Due to the property of
$f$, $\langle \xi, \ell_j \rangle = 0$ for all $j$, where $j$ is the integer representation of $\alpha_{2^{n-q-2}} \oplus \alpha$, if
$\alpha \in W$ and $\alpha \ne \alpha_{(2^{q+1}-1) 2^{n-q-1}}$. From the properties of $B$, $(\alpha_{2^{n-q-2}} \oplus \alpha_j) B^T =
\alpha_{2^{n-q-2}} \oplus \alpha_j B^T$ for all $\alpha_j \in W$. In particular, $(\alpha_{2^{n-q-2}} \oplus \alpha_{(2^{q+1}-1) 2^{n-q-1}}) B^T =
\alpha_{2^{n-q-2}} \oplus \alpha_{(2^{q+1}-1) 2^{n-q-1}}$.

Using (8) with $p = q + 1$ and $j = 2^{n-q-2}$, we have

$$2^{q+1} (\langle \eta_0, e_j \rangle, \langle \eta_1, e_j \rangle, \langle \eta_2, e_j \rangle, \ldots, \langle \eta_{2^{q+1}-1}, e_j \rangle)
= (0, \ldots, 0, \langle \xi, \ell_{(2^{q+2}-1) 2^{n-q-2}} \rangle) H_{q+1} \qquad (9)$$

Since $2^{n-q-1} \ge 2$, $\langle \eta_{2^{q+1}-1}, e_j \rangle$ is even. Comparing the rightmost term in both
sides of (9), we conclude that $\langle \xi, \ell_{(2^{q+2}-1) 2^{n-q-2}} \rangle \equiv 0 \pmod{2^{q+2}}$. By the
same reasoning, we can prove that $\langle \xi, \ell_j \rangle \equiv 0 \pmod{2^{q+2}}$ holds for all $j$ with
$HW(\alpha_j) = q + 2$. □



Lemma 7. Let $f$ be a function on $V_n$ and $\xi$ denote the sequence of $f$. Let $q$ be
an odd number with $1 \le q \le n - 2$, such that

$\langle \xi, \ell_j \rangle = 0$ for all $j$ such that $HW(\alpha_j)$ is odd and $HW(\alpha_j) \le q$,

where $\alpha_j \in V_n$ is the binary representation of integer $j$. Then either there exists
some $j_0$ such that $|\langle \xi, \ell_{j_0} \rangle| \ge 2^{q+2}$, or $\langle \xi, \ell_j \rangle = 0$ for all $j$ where $HW(\alpha_j)$ is odd.

Proof. By using Lemma 6, $\langle \xi, \ell_j \rangle \equiv 0 \pmod{2^{q+2}}$ holds for all $j$ with
$HW(\alpha_j) = q + 2$, where $\alpha_j \in V_n$. There exist two cases to be considered.

Case 1: there exists some $j_0$ with $HW(\alpha_{j_0}) = q + 2$ satisfying $\langle \xi, \ell_{j_0} \rangle \ne 0$. In
this case we have $|\langle \xi, \ell_{j_0} \rangle| \ge 2^{q+2}$. Thus the lemma holds.

Case 2: $\langle \xi, \ell_j \rangle = 0$ holds for all $j$ with $HW(\alpha_j) = q + 2$, where $\alpha_j \in V_n$. In
this case, we conclude that $\langle \xi, \ell_j \rangle = 0$ holds for all $j$ such that $HW(\alpha_j) \le q + 2$
and $HW(\alpha_j)$ is odd.

Once again we use Lemma 6. There exist two cases to be considered.

Case 2.1: we have an integer $t \ge 1$ such that $\langle \xi, \ell_j \rangle = 0$ for all $j$ where
$HW(\alpha_j) \le q + 2(t - 1)$ and $HW(\alpha_j)$ is odd, and there also exists $j_0$ with
$HW(\alpha_{j_0}) = q + 2t$ satisfying $\langle \xi, \ell_{j_0} \rangle \ne 0$. By using Lemma 6, we can conclude
that $|\langle \xi, \ell_{j_0} \rangle| \ge 2^{q+2t}$. Thus the lemma holds in Case 2.1.

Case 2.2: $\langle \xi, \ell_j \rangle = 0$ for all $j$ where $HW(\alpha_j)$ is odd. Clearly the lemma holds. □



Applying Lemma 7, we can extend Theorem 5 in the following way. 

Theorem 7. Let $f$ be an $m$th-order correlation immune function on $V_n$, where $m$ is odd.
Then either $N_f \le 2^{n-1} - 2^{m+1}$ holds for $\frac{1}{2}n - 1 \le m < 0.6n - 0.4$ or $f$ has a
non-zero linear structure.

Proof. If $f$ is balanced, $N_f \le 2^{n-1} - 2^{m+1}$ holds due to Theorem 6 [9]. Thus we
only need to consider the unbalanced case. From Lemma 7, there are two cases
to be considered. Case 1: there exists some $j_0$ such that $|\langle \xi, \ell_{j_0} \rangle| \ge 2^{m+2}$. In this
case, the theorem follows from Lemma 1. Case 2: $\langle \xi, \ell_j \rangle = 0$ for all
$j$ where $HW(\alpha_j)$ is odd. Set $W = \{ \alpha \mid \alpha \in V_n, \; HW(\alpha) \text{ is even} \}$. Thus $W$ is an
$(n - 1)$-dimensional subspace of $V_n$. From the property of $f$, obviously, $\Im_f^* \subseteq W$.
From Theorem 4, $f$ has a non-zero linear structure. □

Note that the nonlinearity of any Boolean function on $V_n$ is upper-bounded
by $2^{n-1} - 2^{\frac{1}{2}n-1}$. For $m < \frac{1}{2}n - 2$, we have $2^{n-1} - 2^{\frac{1}{2}n-1} < 2^{n-1} - 2^{m+1}$. Hence
the inequality $N_f \le 2^{n-1} - 2^{m+1}$ is trivial when $m < \frac{1}{2}n - 2$, although it still
holds. For this reason, we require that $m \ge \frac{1}{2}n - 1$ in Theorem 7.

Theorem 7 represents an extension of Theorem 5. The latter is stated for the
case of $0.6n - 0.4 \le m \le n - 2$.

7 Concluding Remarks

This paper includes three main results. (1) We have presented a tight lower
bound on the absolute indicator for GAC of an $m$th-order correlation immune
function on $V_n$, and proved that a correlation immune function achieves the lower
bound for the absolute indicator if and only if it is affine. (2) We have established
a relationship between correlation immunity and non-zero linear structures. (3)
We have shown that given an $m$th-order correlation immune function $f$ on
$V_n$ with $m$ odd, the nonlinearity $N_f$ of $f$ satisfies $N_f \le 2^{n-1} - 2^{m+1}$ for $\frac{1}{2}n - 1 \le m <
0.6n - 0.4$; otherwise $f$ has a non-zero linear structure. This is an extension of a
known result that holds for $0.6n - 0.4 \le m \le n - 2$. It would be interesting to
know whether or not Theorem 7 can be extended to the case of an even $m$.

Some observations on upper bounds on nonlinearity for a "small" $m$ were
made by Sarkar and Maitra in [9]. For instance, they showed that $N_f \le 2^{n-1} -
2^{\frac{1}{2}n-1} - 2^m$ when $n$ is even and $m \le \frac{1}{2}n - 1$, and $N_f \le 2^{n-1} - 2^{\frac{1}{2}n-1} - 2^{m+1}$
when $f$ is balanced, $n$ is even and $m < \frac{1}{2}n - 1$. It is not clear whether these
bounds are tight.

Abstract. In this paper, we propose a novel relationship between the correlation of two
polynomial-type Boolean functions and the order of an associated algebraic curve. Using this
relationship, we propose a method to generate a resilient (correlation immune and balanced)
function from a cubic polynomial. Since our resilient function is derived from a polynomial
over a finite field, its nonlinearity is much easier to control. Moreover, we can construct a
resilient function with multi-bit outputs. We present several examples of resilient functions
with 2 outputs.



1 Introduction 

Correlation immune functions have been an active research area since T. Siegenthaler first introduced
the concept [10], because most stream ciphers with a nonlinear filter function or a nonlinear combination
of LFSRs are vulnerable to correlation attacks [8]. P. Camion et al. [1] presented a method for
constructing balanced correlation immune functions. J. Seberry et al. [9] discussed the nonlinearity
and propagation characteristics of such functions. Later, S. Chee et al. [3] verified the relationship
between correlation immunity and nonlinearity. However, since all of these methods construct each
Boolean function independently, it seems to be very difficult to construct a correlation immune
vector Boolean function. Recently, vector Boolean functions that are bent [5] or that have small
correlation [12] were introduced, but neither work provides a result on correlation immune
functions.

In this paper, we propose a novel relationship between the correlation of two polynomial-type 
Boolean functions and the order of an associated algebraic curve. As a result, we show that several 
component functions of a cubic polynomial are bent (or semi-bent) for even (or odd, resp.) n. 
Also, we analyze correlation between a cubic function and affine functions and propose a method 
to generate a resilient (correlation immune and balanced) function from a cubic polynomial. While 
every previous method generates a resilient function systematically, we firstly derive a resilient 
function from a polynomial over a finite field. Hence we can easily construct a resilient function 
with multi-bit outputs and its nonlinearity is much easier to control. Our vector resilient functions 
can be used in designing a stream cipher with several output bits. 

In section 2, we give basic definitions on cryptographic properties of Boolean functions. In
section 3, we prove the main theorem. In section 4, we analyze correlation properties of cubic
polynomials over $F_{2^n}$. In section 5, we propose a new method to generate a resilient function and
give several examples. In section 6, we define a vector resilient function and propose a method to
generate it. Also, we present several examples of resilient functions with two outputs. In section
7, we conclude this paper.

2 Basic Definitions 

Let $Z_2$ be the finite field with two elements and $Z_2^n$ the n-dimensional vector space over the field
$Z_2$. A Boolean function on $Z_2^n$ is a function whose input is a binary n-tuple $x = (x_1, x_2, \ldots, x_n)$
and which takes the values 0 and 1. The set of all Boolean functions defined over $Z_2^n$ is denoted by $B_n$. In
some cases it will be more convenient to work with the function $(-1)^{f(x)} = 1 - 2f(x)$, which takes
values in $\{1, -1\}$.

D. Won (Ed.): ICISC 2000, LNCS 2015, pp. 64-72, 2001. 

For $a = (a_1, a_2, \ldots, a_n)$, $b = (b_1, b_2, \ldots, b_n)$ in $Z_2^n$, $a \cdot b = a_1 b_1 \oplus \cdots \oplus a_n b_n$ is the inner product
of two vectors. The Hamming weight of an element $a$ is the number of components equal to 1 and is
denoted by wt($a$). The Hamming weight of a function $f$ is the number of function values equal to
1. The distance $d(f, g)$ between two functions $f$ and $g$ is the number of function values in which
they differ:

$$d(f, g) = \mathrm{wt}(f \oplus g) = \#\{ x \mid f(x) \ne g(x) \}.$$

A function is said to be linear if there is $w = (w_1, \ldots, w_n) \in Z_2^n$ such that it can be written as $l_w(x) = w \cdot x =
w_1 x_1 \oplus w_2 x_2 \oplus \cdots \oplus w_n x_n$. The set of all linear functions on $Z_2^n$ is denoted by $L_n$. A function $f$ is
said to be affine if there are $w \in Z_2^n$ and $c \in Z_2$ such that $f(x) = l_w(x) \oplus c$, and the set of all affine
functions on $Z_2^n$ is denoted by $A_n$.

Now we introduce some basic definitions on properties of Boolean functions. 

Definition 1. $f \in B_n$ is balanced if $\#\{ x \in Z_2^n \mid f(x) = 0 \} = \#\{ x \in Z_2^n \mid f(x) = 1 \}$.

Definition 2. The correlation value between the functions $f$ and $g$ in $B_n$ is defined by

$$c(f, g) = 1 - \frac{d(f, g)}{2^{n-1}}.$$

If $c(f, l_w) = 0$ for all $w \in Z_2^n$ with $1 \le \mathrm{wt}(w) \le k$, we say that $f$ is k-th order correlation immune.
If a balanced function is k-th order correlation immune, it is called a k-th order resilient function.

From the definition, we know that $-1 \le c(f, g) \le 1$. In particular, $c(f, f) = 1$, $c(f, f \oplus 1) = -1$
and $c(f, g) = 0$ if and only if $d(f, g) = 2^{n-1}$.

Definition 3. For any Boolean function $f$ over $\mathbb{Z}_2^n$, we define the Walsh-Hadamard transformation $\hat{f} : \mathbb{Z}_2^n \to \mathbb{R}$ as follows:

$$\hat{f}(w) = \sum_{x \in \mathbb{Z}_2^n} f(x) \cdot (-1)^{w \cdot x}, \quad w \in \mathbb{Z}_2^n.$$

When we apply the Walsh-Hadamard transformation to $\chi_f = (-1)^{f}$, we have

$$\hat{\chi}_f(w) = 2^n c(f, w \cdot x), \qquad (1)$$

which has the following properties.

Lemma 1. Let $f \in B_n$ and $l_v = v \cdot x$.

1. $f$ is balanced if and only if $\hat{\chi}_f(0) = 0$.

2. $f$ is a k-th order correlation immune function if and only if $\hat{\chi}_f(w) = 0$ for all $w \in \mathbb{Z}_2^n$ with $1 \le \mathrm{wt}(w) \le k$.

3. $\hat{\chi}_{f \oplus l_v}(w) = \hat{\chi}_f(w + v)$.
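The definitions above can be checked numerically on a small case. The following is a minimal Python sketch, not taken from the paper: the function names and the sample function `f` (on three variables) are illustrative assumptions. It computes the distance, the correlation value and the Walsh-Hadamard transform of $\chi_f$, so that identity (1) and item 1 of Lemma 1 can be tested directly.

```python
from itertools import product

def dot(a, b):
    """Inner product over Z_2 of two bit tuples."""
    return sum(x & y for x, y in zip(a, b)) & 1

def distance(f, g, n):
    """d(f, g): number of inputs on which f and g differ."""
    return sum(f(x) != g(x) for x in product((0, 1), repeat=n))

def correlation(f, g, n):
    """c(f, g) = 1 - d(f, g) / 2^(n-1)."""
    return 1 - distance(f, g, n) / 2 ** (n - 1)

def walsh_chi(f, w, n):
    """Walsh-Hadamard transform of chi_f = (-1)^f evaluated at w."""
    return sum((-1) ** (f(x) ^ dot(w, x)) for x in product((0, 1), repeat=n))

# Hypothetical sample function on n = 3 variables: f(x) = x1 XOR (x2 AND x3).
n = 3
f = lambda x: x[0] ^ (x[1] & x[2])

# Identity (1): chi_f^(w) = 2^n * c(f, l_w) for every w.
identity_holds = all(
    walsh_chi(f, w, n) == 2 ** n * correlation(f, lambda x, w=w: dot(w, x), n)
    for w in product((0, 1), repeat=n)
)
```

Because the transform is a sum over all $2^n$ inputs, this brute-force form is only practical for small $n$; it is meant as a way to experiment with the definitions rather than as an efficient implementation.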

Now we introduce bent and semi-bent functions. A bent function is a Boolean function whose correlation value is unique up to sign, and a semi-bent function is a balanced Boolean function whose correlation value takes at most two values up to sign. The exact definition is as follows:

Definition 4. Let $f \in B_n$.

1. $f$ is called a bent function if for any $w \in \mathbb{Z}_2^n$

$$|\hat{\chi}_f(w)| = 2^{n/2}.$$

2. $f$ is called a semi-bent function [2] if $\hat{\chi}_f(0) = 0$, and $|\hat{\chi}_f(w)| = 0$ or $2^{\lfloor n/2 \rfloor + 1}$, where $\lfloor m \rfloor$ is the greatest integer not greater than $m$.

Observe that a bent function exists only when $n$ is even, and $f$ is a bent function if and only if $c(f, l_w) = \pm 2^{-n/2}$ for any $w \in \mathbb{Z}_2^n$.

Consider a vector Boolean function $F : \mathbb{Z}_2^n \to \mathbb{Z}_2^m$. Let $b = (b_1, b_2, \ldots, b_m) \in \mathbb{Z}_2^m$ be a nonzero element. We denote by $b \cdot F$ the Boolean function on $\mathbb{Z}_2^n$ which is the linear combination $b_1 f_1 \oplus b_2 f_2 \oplus \cdots \oplus b_m f_m$ of the component functions $f_1, f_2, \ldots, f_m$ of $F$ on $\mathbb{Z}_2^n$.
3 Main Theorem

Let $a$ be an element of a finite field $\mathbb{F}_{2^n}$, $F$ a polynomial over $\mathbb{F}_{2^n}$, and $B = \{\beta_1, \ldots, \beta_n\}$ a basis of $\mathbb{F}_{2^n}$ over $\mathbb{F}_2$. Then we can embed all elements of $\mathbb{F}_{2^n}$ into $\mathbb{Z}_2^n$. That is, we have a natural isomorphism depending on a basis $B$ of $\mathbb{F}_{2^n}$:

$$\mathbb{F}_{2^n} \to \mathbb{Z}_2^n, \quad \sum_{i=1}^{n} a_i \beta_i \mapsto (a_1, a_2, \ldots, a_n), \quad a_i \in \mathbb{Z}_2.$$

Throughout this paper, we consider an element of $\mathbb{F}_{2^n}$ as an element of $\mathbb{Z}_2^n$, and vice versa, without further notice unless this causes confusion.

Now we can define a Boolean function $a \cdot F$ as follows:

$$a \cdot F : \mathbb{Z}_2^n \to \mathbb{Z}_2, \quad x \mapsto a \cdot F(x),$$

where $a \cdot F(x)$ is the inner product of two binary vectors, one of which is obtained by expressing $a$ in the basis $B$ and the other by expressing $F(x)$ in its dual basis $\hat{B}$. Note that $a \cdot F$ does not depend on the choice of the basis $B$ since $a \cdot F(x) = \mathrm{Tr}[aF(x)]$ [4].

Throughout this paper, we use the following notation:

$$\mathrm{Tr}[\cdot] = \mathrm{Tr}_{\mathbb{F}_{2^n}/\mathbb{F}_2}[\cdot], \qquad \mathrm{Te}[\cdot] = \mathrm{Tr}_{\mathbb{F}_{2^n}/\mathbb{F}_{2^2}}[\cdot].$$

Before presenting the main theorem, we introduce a useful fact needed for its proof:

$$x^2 + ax + b = 0 \text{ has a root in } \mathbb{F}_{2^n} \text{ if and only if } \mathrm{Tr}[b/a^2] = 0. \qquad (2)$$



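Theorem 1. Let $F(x), G(x)$ be polynomials over $\mathbb{F}_{2^n}$, and $a, b \in \mathbb{F}_{2^n}$. Consider an algebraic curve $C : y^2 + y = aF(x) + bG(x)$ over $\mathbb{F}_{2^n}$. Then we have

$$c(a \cdot F, b \cdot G) = \frac{\#C(\mathbb{F}_{2^n})}{2^n} - 1,$$

where $\#C(\mathbb{F}_{2^n})$ is the number of $\mathbb{F}_{2^n}$-rational points of the curve $C$ in the affine plane.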

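Proof. Take a basis $B$ of $\mathbb{F}_{2^n}$ over $\mathbb{F}_2$. If we represent $a, b$ and $F(x), G(x)$ in the basis $B$ and its dual basis $\hat{B}$ respectively, we have

$$a \cdot F(x) = \mathrm{Tr}[aF(x)], \qquad b \cdot G(x) = \mathrm{Tr}[bG(x)].$$

Hence we have

$$d(a \cdot F, b \cdot G) = \#\{x \in \mathbb{Z}_2^n \mid \mathrm{Tr}[aF(x)] \neq \mathrm{Tr}[bG(x)]\} = \#\{x \in \mathbb{Z}_2^n \mid \mathrm{Tr}[aF(x) + bG(x)] \neq 0\}.$$

On the other hand, since it has no multiple root, the equation $y^2 + y = aF(x) + bG(x)$ in $y$ has two roots if and only if $\mathrm{Tr}[aF(x) + bG(x)] = 0$ by (2). Hence we have

$$\#C(\mathbb{F}_{2^n}) = 2\,(2^n - d(a \cdot F, b \cdot G)),$$

so

$$c(a \cdot F, b \cdot G) = 1 - \frac{d(a \cdot F, b \cdot G)}{2^{n-1}} = \frac{\#C(\mathbb{F}_{2^n})}{2^n} - 1.$$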
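Theorem 1 can be confirmed by brute force on a small field. The sketch below is an illustration only, under assumptions not made in the paper: it fixes $n = 4$, represents $\mathbb{F}_{2^4}$ as $\mathbb{F}_2[t]/(t^4 + t + 1)$ with elements stored as 4-bit integers, and picks $F(x) = x^3$ and $G(x) = x$ as sample polynomials. It counts the affine points of $y^2 + y = aF(x) + bG(x)$ and compares the result with the correlation of $\mathrm{Tr}[aF(x)]$ and $\mathrm{Tr}[bG(x)]$.

```python
N, MOD = 4, 0b10011          # assumed model: F_16 = F_2[t]/(t^4 + t + 1)
Q = 1 << N

def mul(a, b):
    """Product of two field elements (shift-and-add with reduction)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & Q:
            a ^= MOD
        b >>= 1
    return r

def tr(x):
    """Absolute trace x + x^2 + x^4 + x^8, an element of {0, 1}."""
    s = 0
    for _ in range(N):
        s ^= x
        x = mul(x, x)
    return s

def F(x):                    # sample polynomial F(x) = x^3
    return mul(x, mul(x, x))

def G(x):                    # sample polynomial G(x) = x
    return x

def affine_points(a, b):
    """Number of (x, y) in F_16 x F_16 with y^2 + y = a F(x) + b G(x)."""
    return sum((mul(y, y) ^ y) == (mul(a, F(x)) ^ mul(b, G(x)))
               for x in range(Q) for y in range(Q))

def correlation(a, b):
    """c(a.F, b.G) with a.F(x) = Tr[a F(x)] and b.G(x) = Tr[b G(x)]."""
    d = sum(tr(mul(a, F(x))) != tr(mul(b, G(x))) for x in range(Q))
    return 1 - d / 2 ** (N - 1)

def theorem1_holds(a, b):
    return correlation(a, b) == affine_points(a, b) / 2 ** N - 1
```

Both sides are multiples of 1/8 here, so the floating-point comparison is exact. Calling `theorem1_holds(a, b)` over all 256 pairs is a quick consistency check of the point-counting argument in the proof.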




Using Theorem 1, we can easily derive the following corollary.

Corollary 1. Let $F(x), G(x)$ be polynomials over $\mathbb{F}_{2^n}$ and $a, b \in \mathbb{F}_{2^n}$. Then we have

1. $c(a \cdot F, b \cdot G) = 0$ if and only if the algebraic curve $y^2 + y = aF(x) + bG(x)$ has $2^n$ $\mathbb{F}_{2^n}$-rational points on the affine plane.

2. $a \cdot F$ is k-th order correlation immune if and only if the algebraic curve $y^2 + y = aF(x) + wx$ has $2^n$ $\mathbb{F}_{2^n}$-rational points on the affine plane for each $w \in \mathbb{Z}_2^n$ with $1 \le \mathrm{wt}(w) \le k$. (Note that the embedding of $w$ into $\mathbb{F}_{2^n}$ changes with the basis.)

By the above corollary, if we find some family of algebraic curves which have $2^n$ $\mathbb{F}_{2^n}$-rational points, we may obtain two functions without correlation.
4 Correlation Properties of Cubic Polynomials

Let $F(x)$ be a monic polynomial of degree 3 over $\mathbb{F}_{2^n}$ and $a \in \mathbb{F}_{2^n}$. In this section, we investigate correlation properties of the Boolean function $a \cdot F$. By Theorem 1, the correlation of $a \cdot F$ and $l_w$ is determined by the order of the elliptic curve $y^2 + y = aF(x) + wx$. In fact, $a \cdot F$ is k-th order correlation immune if and only if $y^2 + y = aF(x) + wx$ has order $2^n + 1$ (including the point at infinity) over $\mathbb{F}_{2^n}$ for all $w \in \mathbb{Z}_2^n$ with $1 \le \mathrm{wt}(w) \le k$.

Now we introduce a lemma which is useful to prove the results in this section. 

Lemma 2. [6] Consider a quartic equation

$$x^4 + ax + b = 0, \quad a, b \in \mathbb{F}_{2^n}, \quad a \neq 0. \qquad (3)$$

- If $n$ is odd, then (3) has either no solution or exactly two solutions.

- If $n$ is even and $a$ is not a cube, then (3) has exactly one solution.

- If $n$ is even and $a$ is a cube, then (3) has four solutions if $\mathrm{Te}[b/a^{4/3}] = 0$, and no solutions if $\mathrm{Te}[b/a^{4/3}] \neq 0$.



4.1 The case of odd n 



First of all, we introduce a lemma on the orders of supersingular elliptic curves over $\mathbb{F}_{2^n}$ for odd $n$.

Lemma 3. [6] For odd $n$, any supersingular elliptic curve over $\mathbb{F}_{2^n}$ is isomorphic to one of the following three curves.

- $y^2 + y = x^3$, whose order is $2^n + 1$.

- $y^2 + y = x^3 + x$, whose order is $2^n + 1 \pm 2^{(n+1)/2}$.

- $y^2 + y = x^3 + x + 1$, whose order is $2^n + 1 \pm 2^{(n+1)/2}$.

Observe that if $n$ is odd, $x^3 - a$ has a root in $\mathbb{F}_{2^n}$ for any $a \in \mathbb{F}_{2^n}$ because $\gcd(2^n - 1, 3) = 1$.

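Theorem 2. Let $n$ be an odd integer, $a, v \in \mathbb{F}_{2^n}$, and $F(x) = x^3 + vx \in \mathbb{F}_{2^n}[x]$. Assume that $x, w \in \mathbb{Z}_2^n$ are embedded into $\mathbb{F}_{2^n}$ by a basis and its dual basis, respectively. If we let $\sqrt[3]{a}$ be a root of $x^3 - a$ in $\mathbb{F}_{2^n}$, we have

$$\hat{\chi}_{a \cdot F}(w) = \begin{cases} 0 & \text{if } (av + w)/\sqrt[3]{a} = s^4 + s \text{ for some } s \in \mathbb{F}_{2^n}, \\ \pm 2^{(n+1)/2} & \text{otherwise.} \end{cases}$$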



Proof. Consider an elliptic curve $E : y^2 + y = a(x^3 + vx) + wx$. Note that $E$ is isomorphic to $E_1 : y^2 + y = x^3 + \left(\frac{av + w}{\sqrt[3]{a}}\right)x$ by the linear transformation $(x, y) \mapsto (\sqrt[3]{a}\,x, y)$. Moreover $E_1$ is isomorphic to $y^2 + y = x^3$ if and only if the following two equations are solvable simultaneously in $\mathbb{F}_{2^n}$ [6]:

$$s^4 + s + \frac{av + w}{\sqrt[3]{a}} = 0, \qquad (4)$$

$$t^2 + t + s^6 = 0. \qquad (5)$$

Observe that (5) is solvable if and only if $\mathrm{Tr}[s^6] = 0$ by (2), that is, if and only if $\mathrm{Tr}[s^3] = 0$. If $s_0$ is a root of (4), then $s_0 + 1$ is also a root of (4). If (4) has a root, one of the roots of (4) satisfies $\mathrm{Tr}[s^3] = 0$ since $\mathrm{Tr}[(s_0 + 1)^3] = \mathrm{Tr}[s_0^3] + 1$. Hence we see that $E_1$ is isomorphic to $y^2 + y = x^3$ if and only if (4) has a root in $\mathbb{F}_{2^n}$. In this case, we have $c(a \cdot F, l_w) = 0$ by Theorem 1 because $y^2 + y = x^3$ has order $2^n + 1$.

On the other hand, if (4) has no root in $\mathbb{F}_{2^n}$, then $E_1$ must be isomorphic to either $y^2 + y = x^3 + x$ or $y^2 + y = x^3 + x + 1$. In either case, $E_1$ has order $2^n + 1 \pm 2^{(n+1)/2}$. Hence we have $c(a \cdot F, l_w) = \pm 2^{(1-n)/2}$ by Theorem 1.

If we use $c(a \cdot F, l_w) = \hat{\chi}_{a \cdot F}(w)/2^n$, we obtain the theorem.



Note that $a \cdot F$ is a semi-bent function for any basis $B$ of $\mathbb{F}_{2^n}$. From Lemma 2, we know that $s^4 + s + (av + w)/\sqrt[3]{a}$ has exactly 0 or 2 different roots. Hence given $a \in \mathbb{F}_{2^n}$, the number of $w$ such that $s^4 + s + (av + w)/\sqrt[3]{a}$ has a root in $\mathbb{F}_{2^n}$ is exactly $2^n/2 = 2^{n-1}$. Hence $a \cdot F$ has the correlation value 0 for a half of all $w \in \mathbb{Z}_2^n$.
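The semi-bent behaviour for odd $n$ can be observed numerically. The following Python sketch is illustrative and rests on assumptions outside the paper: it takes $n = 5$, models $\mathbb{F}_{2^5}$ as $\mathbb{F}_2[t]/(t^5 + t^2 + 1)$, and indexes the spectrum by the field element $w$ directly (the dual-basis embedding is a bijection, so the multiset of values is unchanged). For nonzero $a$ it computes all values $\sum_x (-1)^{\mathrm{Tr}[a(x^3 + vx) + wx]}$.

```python
N, MOD = 5, 0b100101         # assumed model: F_32 = F_2[t]/(t^5 + t^2 + 1)
Q = 1 << N

def mul(a, b):
    """Product of two field elements stored as 5-bit integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & Q:
            a ^= MOD
        b >>= 1
    return r

def tr(x):
    """Absolute trace, an element of {0, 1}."""
    s = 0
    for _ in range(N):
        s ^= x
        x = mul(x, x)
    return s

def spectrum(a, v):
    """Walsh values of Tr[a(x^3 + vx)] against Tr[wx], for every field element w."""
    g = [tr(mul(a, mul(x, mul(x, x)) ^ mul(v, x))) for x in range(Q)]
    return [sum((-1) ** (g[x] ^ tr(mul(w, x))) for x in range(Q))
            for w in range(Q)]

spec = spectrum(1, 3)        # sample choice a = 1, v = t + 1
zeros = spec.count(0)        # Theorem 2 predicts 2^(n-1) zero values
```

For $n = 5$ the nonzero values should all equal $\pm 2^{(n+1)/2} = \pm 8$, and exactly half of the 32 values should vanish.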
4.2 The case of even n

We have the following lemma on the orders of supersingular elliptic curves over $\mathbb{F}_{2^n}$ for even $n$.

Lemma 4. [6] Let $\alpha, \beta, \delta \in \mathbb{F}_{2^n}$, $\gamma \notin \mathbb{F}_{2^n}^3$, and $\mathrm{Te}[\delta] \neq 0$. For even $n$, any supersingular elliptic curve over $\mathbb{F}_{2^n}$ is isomorphic to one of the following three curves.

- $y^2 + y = x^3 + \delta x$, whose order is $2^n + 1$.

- $y^2 + y = x^3 + \alpha$, whose order is $2^n + 1 \pm 2^{n/2+1}$.

- $y^2 + \gamma y = x^3 + \alpha$, whose order is $2^n + 1 \pm 2^{n/2}$.

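Theorem 3. Let $n$ be an even integer, $a, v \in \mathbb{F}_{2^n}$, and $F(x) = x^3 + vx \in \mathbb{F}_{2^n}[x]$. Assume that $x, w \in \mathbb{Z}_2^n$ are embedded into $\mathbb{F}_{2^n}$ by a basis and its dual basis, respectively.

1. If $x^3 - a$ has a root, say $\sqrt[3]{a}$, in $\mathbb{F}_{2^n}$, then we have

$$\hat{\chi}_{a \cdot F}(w) = \begin{cases} 0 & \text{if } \mathrm{Te}\left[\frac{av + w}{\sqrt[3]{a}}\right] \neq 0, \\ \pm 2^{n/2 + 1} & \text{if } \mathrm{Te}\left[\frac{av + w}{\sqrt[3]{a}}\right] = 0. \end{cases}$$

2. If $a \notin \mathbb{F}_{2^n}^3$, then we have

$$\hat{\chi}_{a \cdot F}(w) = \pm 2^{n/2} \text{ for all } w.$$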

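Proof. Let $\sqrt[3]{a}$ be a root of $x^3 - a$ in $\mathbb{F}_{2^n}$. Consider an elliptic curve $E : y^2 + y = a(x^3 + vx) + wx$ over $\mathbb{F}_{2^n}$. Observe that $E$ is isomorphic to $E_1 : y^2 + y = x^3 + \left(\frac{av + w}{\sqrt[3]{a}}\right)x$ by the linear transformation $(x, y) \mapsto (\sqrt[3]{a}\,x, y)$. If $\mathrm{Te}\left[\frac{av + w}{\sqrt[3]{a}}\right] \neq 0$, $E_1$ has order $2^n + 1$ by Lemma 4, so $c(a \cdot F, l_w) = 0$. If $\mathrm{Te}\left[\frac{av + w}{\sqrt[3]{a}}\right] = 0$, then $s^4 + s + \frac{av + w}{\sqrt[3]{a}}$ has a root, say $s_0$, in $\mathbb{F}_{2^n}$ by (3). By the linear transformation $(x, y) \mapsto (x + s_0^2, y + s_0 x)$, we have that $E_1$ is isomorphic to $y^2 + y = x^3 + s_0^3$, which has order $2^n + 1 \pm 2^{n/2+1}$, so $c(a \cdot F, l_w) = \pm 2^{1 - n/2}$.

On the other hand, if $a \notin \mathbb{F}_{2^n}^3$, then $E$ is isomorphic to $y^2 + \gamma y = x^3 + \alpha$ for some nonzero $\alpha, \gamma \in \mathbb{F}_{2^n}$, which has order $2^n + 1 \pm 2^{n/2}$, so $c(a \cdot F, l_w) = \pm 2^{-n/2}$.

If we use $c(a \cdot F, l_w) = \hat{\chi}_{a \cdot F}(w)/2^n$, we obtain the theorem.

Observe that $a \cdot F$ is a bent function for $a \notin \mathbb{F}_{2^n}^3$, regardless of the basis $B$ of $\mathbb{F}_{2^n}$. In the case of $a \in \mathbb{F}_{2^n}^3$, $a \cdot F$ is balanced if and only if $\mathrm{Te}[v a^{2/3}] \neq 0$, since $\mathrm{Te}[w/\sqrt[3]{a}] = 0$ for $w = 0$.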
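The two cases of Theorem 3 can likewise be explored by exhaustive computation. This is a sketch under assumptions not stated in the paper: $n = 4$, the field modelled as $\mathbb{F}_2[t]/(t^4 + t + 1)$, and the spectrum indexed by the field element $w$ itself. The helper names are hypothetical.

```python
N, MOD = 4, 0b10011          # assumed model: F_16 = F_2[t]/(t^4 + t + 1)
Q = 1 << N

def mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & Q:
            a ^= MOD
        b >>= 1
    return r

def power(a, e):
    r = 1
    for _ in range(e):
        r = mul(r, a)
    return r

def tr(x):                   # trace to F_2
    s = 0
    for _ in range(N):
        s ^= x
        x = mul(x, x)
    return s

def te(x):                   # trace to F_4: x + x^4 (here n/2 = 2 terms)
    return x ^ power(x, 4)

def spectrum(a, v):
    """Walsh values of Tr[a(x^3 + vx)] against Tr[wx], for every field element w."""
    g = [tr(mul(a, power(x, 3) ^ mul(v, x))) for x in range(Q)]
    return [sum((-1) ** (g[x] ^ tr(mul(w, x))) for x in range(Q))
            for w in range(Q)]

cubes = {power(x, 3) for x in range(1, Q)}

def matches_theorem3(a, v):
    spec = spectrum(a, v)
    if a not in cubes:                       # case 2: bent, |value| = 2^(n/2)
        return all(abs(s) == 4 for s in spec)
    root_inv = power(next(r for r in range(1, Q) if power(r, 3) == a), Q - 2)
    return all(                              # case 1: zero iff Te[(av+w)/cuberoot(a)] != 0
        (spec[w] == 0) == (te(mul(mul(a, v) ^ w, root_inv)) != 0)
        and spec[w] in (0, 8, -8)
        for w in range(Q))
```

Any cube root of $a$ may be used in case 1: two cube roots differ by a factor in $\mathbb{F}_{2^2}$, and $\mathrm{Te}$ is $\mathbb{F}_{2^2}$-linear, so the condition $\mathrm{Te}[\cdot] \neq 0$ does not depend on the choice.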

5 Resilient Functions 

In this section, we propose a method to generate a resilient function and give several examples. 

Theorem 4. Let $n$ be an even integer, $b, v \in \mathbb{F}_{2^n}$, and $F_v(x) = x^3 + vx \in \mathbb{F}_{2^n}[x]$. Assume that $x, w \in \mathbb{Z}_2^n$ are embedded into $\mathbb{F}_{2^n}$ by a basis and its dual basis, respectively. Then $\mathrm{Tr}[b^{-3}F_v(x)]$ is a 1-st order resilient function if and only if $\mathrm{Te}[vb^{-2}] \neq 0$ and $\mathrm{Te}[wb] \neq \mathrm{Te}[vb^{-2}]$ for all elements $w$ in the dual basis.

Proof. Using Theorem 3, we have that

$$\hat{\chi}_{b^{-3} \cdot F_v}(w) = 0 \text{ if and only if } \mathrm{Te}[vb^{-2} + wb] \neq 0.$$

Hence we have by Lemma 1 that $\mathrm{Tr}[b^{-3}F_v(x)]$ is 1-st order resilient if and only if $\mathrm{Te}[wb] \neq \mathrm{Te}[vb^{-2}]$. Note that $x, w \in \mathbb{Z}_2^n$ are embedded into $\mathbb{F}_{2^n}$ by a basis and its dual basis, respectively. Since $\mathrm{wt}(w) \le 1$ implies that $w = 0$ or $w$ is a basis element of the dual basis, we have the theorem.

Using this, we can generate resilient functions easily. See the following corollary.

Corollary 2. Let $f_v(x) = \mathrm{Tr}[x^3 + vx]$. Assume that $x, w \in \mathbb{Z}_2^n$ are embedded into $\mathbb{F}_{2^n}$ by a normal basis and its dual basis, respectively. Then $f_v$ is a 1-st order resilient function for every $v$ with $\mathrm{Te}[v] = 1$, the number of which is $2^{n-2}$.



Proof. Take a normal basis $B = \{\theta, \theta^2, \ldots, \theta^{2^{n-1}}\}$ of $\mathbb{F}_{2^n}$ over $\mathbb{F}_2$. If we represent $x = \sum_{i=0}^{n-1} x_i \theta^{2^i}$ ($x_i \in \mathbb{F}_2$), we have

$$\mathrm{Te}[x] = \sum_{i=0}^{n-1} X_i \theta^{2^i}, \quad \text{where } X_i = \begin{cases} x_0 + x_2 + \cdots + x_{n-2} & \text{for even } i, \\ x_1 + x_3 + \cdots + x_{n-1} & \text{for odd } i, \end{cases} \qquad (6)$$

since $\mathrm{Te}[x] = x + x^4 + x^{16} + \cdots + x^{4^{n/2-1}}$ and $x^2 = \sum_{i=0}^{n-1} x_{i-1}\theta^{2^i}$ (let $x_{-1} = x_{n-1}$). Note that a dual basis of a normal basis is also a normal basis. Since $1 = \sum_{i=0}^{n-1} \theta^{2^i}$, we have that $\mathrm{Te}[w] = 1$ if and only if each of the sums of the odd and of the even components of $w$ is 1. Hence any vector with Hamming weight 0 or 1 does not satisfy $\mathrm{Te}[w] = 1$, so $\mathrm{Tr}[1 \cdot (x^3 + vx)]$ is 1-st order resilient for every $v$ with $\mathrm{Te}[v] = 1$. The last statement follows from the fact that a fourth of all elements of $\mathbb{F}_{2^n}$ satisfy $\mathrm{Te}[v] = 1$.

Using Theorem 4, we can generate a resilient function by the following procedure.

Procedure 1: Generate a Resilient Function

1. Fix a basis $B$ of $\mathbb{F}_{2^n}$.

2. Compute the dual basis $\hat{B}$ of $B$.

3. Calculate the set $S = \{b \in \mathbb{F}_{2^n} \mid \mathrm{Te}[wb] \neq 1 \text{ for all } w \text{ in } \hat{B}\}$.

4. Find $v_b$ for each $b \in S$ such that $\mathrm{Te}[v_b b^{-2}] = 1$, which covers a fourth of all elements of $\mathbb{F}_{2^n}$.

5. Compute the function $\mathrm{Tr}[b^{-3}(x^3 + v_b x)]$, which is a 1-st order resilient function.

Note that the value of the trace can be replaced by another nonzero element in $\mathbb{F}_{2^2}$.
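The steps of Procedure 1 can be sketched in Python for a small case. Everything concrete below is an assumption made for illustration: $n = 4$, the field $\mathbb{F}_2[t]/(t^4 + t + 1)$ with elements stored as 4-bit integers (bit $i$ is the coefficient of $t^i$), the polynomial basis $\{1, t, t^2, t^3\}$, a brute-force dual-basis search instead of Lemma 5, and the restriction to nonzero $b$ so that $b^{-3}$ exists. The resilience check is done independently of Theorem 4, directly from Lemma 1.

```python
N, MOD = 4, 0b10011            # assumed model: F_16 = F_2[t]/(t^4 + t + 1)
Q = 1 << N

def mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & Q:
            a ^= MOD
        b >>= 1
    return r

def power(a, e):
    r = 1
    for _ in range(e):
        r = mul(r, a)
    return r

def tr(x):                     # trace to F_2
    return x ^ power(x, 2) ^ power(x, 4) ^ power(x, 8)

def te(x):                     # trace to F_4
    return x ^ power(x, 4)

# Steps 1-2: fix a basis and find its dual basis (Tr[beta_j * gamma_i] = 1 iff i = j).
basis = [1, 2, 4, 8]
dual = [next(g for g in range(Q)
             if all(tr(mul(g, basis[j])) == (i == j) for j in range(N)))
        for i in range(N)]

# Step 3: the set S (nonzero b only).
S = [b for b in range(1, Q) if all(te(mul(w, b)) != 1 for w in dual)]

# Steps 4-5: admissible v for each b, and the truth table of Tr[b^-3 (x^3 + v x)].
def truth_table(b, v):
    a = power(b, 12)           # b^(-3), because b^15 = 1
    return [tr(mul(a, power(x, 3) ^ mul(v, x))) for x in range(Q)]

functions = {(b, v): truth_table(b, v)
             for b in S for v in range(Q)
             if te(mul(v, power(b, 13))) == 1}      # b^13 = b^(-2)

def is_first_order_resilient(tt):
    """Lemma 1: the Walsh value vanishes at w = 0 and at every weight-1 w."""
    def walsh(w):
        return sum((-1) ** (tt[x] ^ (bin(x & w).count("1") & 1)) for x in range(Q))
    return all(walsh(w) == 0 for w in [0] + [1 << i for i in range(N)])
```

Since inputs are integers whose bits are coordinates in the polynomial basis, the linear function $l_w$ for a unit vector $w$ is simply one input bit, which is why the check needs no field arithmetic.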

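When we use Theorem 4, we need to compute the dual basis of a given basis. Now we introduce a lemma to compute the dual basis of a given polynomial basis.

Lemma 5. [7] Let $B = \{1, \alpha, \ldots, \alpha^{n-1}\}$ be a polynomial basis of $\mathbb{F}_{2^n}$ over $\mathbb{F}_2$ and let $f(x)$ be the minimum polynomial of $\alpha$ over $\mathbb{F}_2$. Let $f(x) = (x - \alpha)(\beta_0 + \beta_1 x + \cdots + \beta_{n-1}x^{n-1})$, $\beta_i \in \mathbb{F}_{2^n}$. Then the dual basis of $B$ is $\hat{B} = \{\gamma_0, \gamma_1, \ldots, \gamma_{n-1}\}$ where

$$\gamma_i = \frac{\beta_i}{f'(\alpha)}, \quad i = 0, 1, \ldots, n-1.$$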



Throughout this paper, we express an element of the finite field as a Boolean string. That is, we denote $\sum_{i=0}^{n-1} a_i\alpha_i$, $a_i \in \mathbb{F}_2$, by

$$\sum_{i=0}^{n-1} a_i\alpha_i = a_{n-1} \cdots a_2 a_1 a_0,$$

where $\{\alpha_0, \alpha_1, \ldots, \alpha_{n-1}\}$ is a basis of $\mathbb{F}_{2^n}$ over $\mathbb{F}_2$. Also we denote a Boolean function $f(x)$ by its values on $\mathbb{F}_{2^n}$, i.e.

$$f(2^n - 1)f(2^n - 2) \cdots f(2)f(1)f(0),$$

where we use the Boolean string notation for elements of the finite field. When we represent such a binary string as a hexadecimal expression, we mark it with a small subscript x. For example, 6c_x means 0110 1100 as a binary string.

Using the above procedure and Lemma 5, we generated several resilient functions, as in the following examples.

Example 1. Let n = 4. Let t be a root of the irreducible polynomial $x^4 + x + 1$ over $\mathbb{F}_2$. Take a basis $B = \{1, t, t^2, t^3\}$. Then we can compute its dual basis $\hat{B} = \{t^3 + 1, t^2, t, 1\}$ by Lemma 5. Using Procedure 1, we can get 4 resilient functions.

    b      v      f(x) = Tr[b^{-3}(x^3 + vx)]
    7_x    a_x    665a_x
    d_x    1_x    1de2_x
    e_x    1_x    369c_x
    f_x    4_x    1e78_x



Example 2. Let n = 6. Let t be a root of the irreducible polynomial $x^6 + x + 1$ over $\mathbb{F}_2$. Take a basis $B = \{1, t, t^2, t^3, t^4, t^5\}$. Then we can compute its dual basis $\hat{B} = \{t^5 + 1, t^4, t^3, t^2, t, 1\}$ by Lemma 5. Using Procedure 1, we can get 11 resilient functions.

    b       v       f(x) = Tr[b^{-3}(x^3 + vx)]
    17_x    1_x     2e68 741d 68d1 1d86_x
    19_x    1_x     16e4 d827 827d 6e41_x
    1f_x    4_x     0cfc 3f30 5659 9a6a_x
    27_x    1_x     2671 e862 d48e 174d_x
    2d_x    1_x     0f96 a5c3 96f0 3ca5_x
    ZjOx    1_x     2e74 1267 1d68 de84_x
    32_x    4_x     1286 ed74 67d1 482e_x
    00      2_x     3639 6c63 36c6 6c9c_x
    36_x    20_x    5665 596a 65a9 6aa6_x
    3c_x    2_x     1e66 4611 44e1 ee64_x
    3f_x    3_x     2471 6d17 e8bd 8e24_x



6 Vector Resilient Functions 

A function $F : \mathbb{Z}_2^n \to \mathbb{Z}_2^m$ is called a vector Boolean function. Note that there are unique Boolean functions $f_i$ such that $F = (f_1, f_2, \ldots, f_m)$. Any linear combination of the $f_i$'s is called a component function of $F$.

Definition 5. A vector Boolean function is said to be k-th order resilient if and only if every component function is k-th order resilient. A vector Boolean function which is k-th order resilient is called a k-th order vector resilient function.

Theorem 5. Let $n$ be an even integer and $F_v(x) = x^3 + vx \in \mathbb{F}_{2^n}[x]$ for $v \in \mathbb{F}_{2^n}$. Assume that $x, w \in \mathbb{Z}_2^n$ are embedded into $\mathbb{F}_{2^n}$ by a basis $B$ and its dual basis $\hat{B}$, respectively. For a nonzero $e \in \mathbb{F}_{2^2}$, let

$$S_e = \{b \in \mathbb{F}_{2^n} \mid \mathrm{Te}[wb] \neq e \text{ for all } w \text{ in } \hat{B}\}. \qquad (7)$$

If any linear combination of $b_1^{-3}, b_2^{-3}, \ldots, b_r^{-3}$ is equal to $b^{-3}$ for some $b \in S_e$ and $\mathrm{Te}[vb^{-2}] = e$ for such $b$, then $(f_1, f_2, \ldots, f_r)$ is a 1-st order vector resilient function, where $f_i = \mathrm{Tr}[b_i^{-3}F_v(x)]$.

Proof. Let $f = \sum_{i=1}^{r} a_i f_i$, $a_i \in \mathbb{F}_2$, be a linear combination of the component functions of $(f_1, f_2, \ldots, f_r)$. Since $\mathrm{Tr}[\cdot]$ is a homomorphism, we have $f(x) = \mathrm{Tr}[(\sum_{i=1}^{r} a_i b_i^{-3})F_v(x)]$. By assumption, we know that any linear combination of $b_1^{-3}, b_2^{-3}, \ldots, b_r^{-3}$ is equal to $b^{-3}$ for some $b \in S_e$. Hence we have $f(x) = \mathrm{Tr}[b^{-3}F_v(x)]$. Since $\mathrm{Te}[vb^{-2}] = e$ by assumption, we have $\mathrm{Te}[vb^{-2}] \neq 0$ and $\mathrm{Te}[wb] \neq \mathrm{Te}[vb^{-2}]$ for all elements $w$ in the dual basis, which by Theorem 4 completes the proof.

Using Theorem 5, we can generate a vector resilient function by the following procedure.

Procedure 2: Generate a Resilient Function with Two Outputs

1. Fix a basis $B$ of $\mathbb{F}_{2^n}$.

2. Compute the dual basis $\hat{B}$ of $B$.

3. Calculate the set $S = \{b \in \mathbb{F}_{2^n} \mid \mathrm{Te}[wb] \neq 1 \text{ for all } w \text{ in } \hat{B}\}$.

4. Calculate the set $S_2 = \{(b_1, b_2, b_3) \mid b_1^{-3} + b_2^{-3} = b_3^{-3},\ b_i \in S\}$.

5. Find $v$ for each $(b_1, b_2, b_3) \in S_2$ such that $\mathrm{Te}[v b_i^{-2}] = 1$ for $i = 1, 2, 3$.

6. Compute the functions $f_i = \mathrm{Tr}[b_i^{-3}(x^3 + vx)]$ for $i = 1, 2, 3$. Then $(f_1, f_2)$ is a 1-st order resilient function with two outputs.



Example 3. Let n = 6. Let t be a root of the irreducible polynomial $x^6 + x + 1$ over $\mathbb{F}_2$. Take a basis $B = \{1, t, t^2, t^3, t^4, t^5\}$. Then its dual basis is $\hat{B} = \{t^5 + 1, t^4, t^3, t^2, t, 1\}$ by Lemma 5. Using Procedure 2, we can get 2 vector resilient functions with two outputs.

61 — 17a;, ^2 — 36a;, 63 — 3Ca;, V — 3Ca; 

• /i(a;) = 74e2 2e47 e28b A7dl^ 

• 'f^lx) = 6aa6 65o9 596a 5665a; 

• = le44 46ee 66el 1164a; 

61 — 2Ca;, 62 — 37a;, 63 — ^fxj ^ — 2Ca; 

• fi{x) = 2e74, ed48, ld68, 2176a, 

• f 2 {x) = 6393, 39c9, 9c93, c6c9a, 

• f^lx) = Mel, d481, 8126, elb2^ 

Example 4- Let n = 8. Let t be a root of an irreducible polynomial a;® -I- a;® -|- a;® -|- a; -1-1 over F2 . Take 
a basis B — {1, - ■ ■ , t^}. Then its dual basis is B = {5da,, bax, Mx, c9a,, fix, 81a,, 61a,, 9/a,} by 

Lemma 5, we represent the elements of B by Boolean expression. Using Procedure 2, we can get 
14 vector resilient functions with two outputs. We give three of them. 

61 = 3dx, 62 — Idx, 63 = fix, X = 22a; 

• /i(a;) = 69cc 3396 9633 cc69 5a00 ffa5 abff 005a aafO Of 55 aafO Of 55 66c3 3c99 66c3 
3c99a; 

• f 2 {x) = 3/56 9503 c056 95 fc cfa6 9a0c 30a6 9a/3 c/a6 65/3 30a6 650c c0a9 9503 3/a9 
95fcx 

• /3(a;) = 569a a695 5665 5995 95a6 65a9 9559 9aa9 6556 6aa6 9a56 6a59 a66a a99a 596a 
a965a; 

- 61 = 5fx, 62 = fix, 63 = ffx, V = 83a; 

• f^ {x) = 0c3/ c/03 /c30 3/Oc cO/3 /c30 30/c 0c3/ 9559 5665 9aa9 5995 a66a 9aa9 a99a 
9559a; 

• f 2 {x) = 36a0 c95f /59c 0a63 5/36 a0c9 63/5 9c0a 3950 c6a/ 0593 fade 50c6 a/39 93/a 
6c05a; 

• /3(a;) = 3a9/ 065c 09ac 356/ 9/c5 5c f 9 5309 9035 ac09 90ca 9/3a a3/9 fOac 3590 3a60 
/95c, 

- 61 = 65a;, 62 = 6/a;, 63 = bCx, v= Cx 

• f^{x) = 46d2 2d46 d264 642d lc87 87el 781e el78 642d d264 d264 642d lc87 87el 87el 
lc87, 

• f 2 {x) = 5acc 99 fO 33a5 0/66 a5cc 66/0 335a 0/99 55c3 96// 3caa 0069 553c 9600 c3aa 
//69a; 

• fsix) = file bAbb elll bbAb bbAb elll 4544 eeel elec 4445 eele 5444 4555 llel 4445 elec a. 

Example 5. Let n = 10. Let t be a root of an irreducible polynomial a:^® -I- a;® -I- 1 over F2. Take 
a basis $B = \{1, t, \ldots, t^9\}$. By a similar procedure, we can get 33 vector Boolean functions with two outputs.
7 Conclusion

In this paper, we proposed a method to generate a vector resilient function with multiple outputs. Our approach is to derive a resilient function from a polynomial over a finite field, whose properties are closely related to the associated algebraic curves. We analyzed the properties of a cubic polynomial and the associated elliptic curves to derive a 1-st order resilient function with multiple outputs. We expect that this method may be generalized to higher degree polynomials and algebraic curves to obtain higher order resiliency.
Abstract. We describe a universal hash-function family, PolyR, which
hashes messages of effectively arbitrary lengths in 3.9-6.9 cycles/byte
(cpb) on a Pentium II (achieving a collision probability in the range
$2^{-16}$ to $2^{-50}$). Unlike most proposals, PolyR actually hashes short
messages faster (per byte) than long ones. At the same time, its key is only a few
bytes, the output is only a few bytes, and no “preprocessing” is needed
to achieve maximal efficiency. Our designs have been strongly influenced
by low-level considerations relevant to software speed, and experimental
results are given throughout.



Keywords: Universal hashing, software-optimized hashing, message au- 
thentication, UMAC. 



1 Introduction 

Ever since its introduction by Carter and Wegman in 1979 [6], universal hash- 
ing has been an important tool in computer science. Recent attention has been 
paid to universal hashing as a method to authenticate messages, an idea also 
proposed by these authors [12]. Its use in authentication has resulted in sev- 
eral very fast universal hash functions with low collision probabilities. But the 
implementations of these fastest universal hash functions tend to require either 
significant precomputed data, long keys or special-purpose hardware to achieve 
their impressive speeds. 

Our contribution is a polynomial-based hash function we call PolyR. This 
hash function is not as fast as the fastest hash functions which have been designed 
for message authentication (its speed is about 3.9-6.9 cpb). But that is still very
fast, and, compared to the fastest of hash functions, PolyR has some different and 
desirable characteristics. First, it hashes messages of essentially any length (and 
varying lengths are fine). The key is short (say 28 bytes), independent of the 
message length. The key requires no preprocessing: the natural representation of 
the key is the desirable one for achieving good efficiency. Quite pleasantly, the 
hash function is fastest, per byte, on short messages — it actually gets slower, 
per byte, as the message gets longer (the rates are constant until particular 

D. Won (Ed.): ICISC 2000, LNCS 2015, pp. 73-89, 2001. 
Hash Function        Collision Bound   Code + Data Size    Speed (cpb)   Output (bits)
This paper                             (124 + 8) bytes      3.9           32
This paper                             (409 + 16) bytes     6.9           64
Division Hash [10]   n2^{-64}          ~ (? + 8) KB         7.5           64
UHASH-16 [3]                           ~ (7 + 2) KB         1.0           64
UHASH-32 [3]                           ~ (8 + 2) KB         2.0           64
hash127 [2]                            ~ (4 + 1.5) KB       4.3           127
MD5                  unknown           1.7 KB               5.3           128
SHA-1                unknown           4.3 KB               13.1          160

Table 1. Comparing the new constructions with some other hash families. Sizes marked
with ~ are conservative estimates. All timings are for the fastest Pentium/Pentium II
timings reported. To obtain smaller collision bounds one can hash twice or use the
methods of this paper with p(96) or p(128).



threshold lengths are crossed, like 2^^ and 2^^ bytes). This is the exact reverse of 
most optimized hash functions having short output lengths: they do better as the 
message gets longer. If used for authentication, working best for short messages 
is desirable insofar as most network traffic is short. Finally, implementation of
our hash function family is simple and requires no special hardware (like floating- 
point units or multimedia execution units) to do well. 

The hash function family PolyR was designed for use in a multi-layer hashing 
construction, to be used for fast message authentication. In such constructions a 
very fast first layer of hashing is applied to an incoming message to compress it to 
a small fraction of its original length. This compressed message is then passed to 
PolyR. When used as a second hash-layer in this manner, it can be expected that 
the vast majority of messages fed to PolyR will be short, since messages must be 
quite huge indeed before the second-layer compressed message gets long. 

The hash function PolyR is a refinement of the classical suggestion of Carter
and Wegman where one treats the message as specifying the coefficients of a
polynomial, and one evaluates that polynomial at a point which is the key.
Our refinements involve: (1) choosing the base field to be a prime field whose
prime is just smaller than $2^{32}$, $2^{64}$, or $2^{128}$ (this is a common trick); (2) using a simple
“translation” trick to take care of the problem that some messages will now give
rise to coefficients not in the field (because our field is just smaller than a power
of two); (3) limiting the key space to a particular “convenient” subset of all the
field points; and (4) using a “ramping-up” trick so that we don’t have to pay in
efficiency for short messages in order that the method can handle long ones. The
result is a simple, flexible, fast-to-compute hash function. These various tricks,
individually rather modest, work together to give rise to quite a nice hash-function
family.
1.1 Related Work

Carter and Wegman introduced the ideas of universal hashing and using polyno- 
mials in universal hashing in 1979 [6]. Since that time polynomials have been used 
for fast hashing in many other works. In his “Cryptographic CRC”, Krawczyk 
views messages to be hashed as polynomials over GF(2) which are divided mod- 
ulo a random irreducible polynomial [8]. The division can be done quickly in 
hardware using linear feedback shift registers. Shoup describes several variants 
of polynomial hashing and provides implementation results [10]. His “generalized
division hash”, which bounds collisions between 64n-bit messages as no more
than $n2^{-64}$, views messages as polynomials over GF($2^{64}$), uses an 8 KB precomputed
table, and has a throughput of 7.5 cpb on a Pentium [10]. Afanassiev,
Gehrmann and Smeets discuss fast polynomial hashing modulo random 3-, 4- 
and 5-nomials [1] . Their methods use small keys, but no implementation results 
are provided. Bernstein defines hashl27, a polynomial hash defined over a large 
prime field, Zp(i 27 ) [2]. Over 32n-bit messages it has a collision probability of 
no more than Bernstein’s implementation uses floating-point operations 

and a 1.5 KB precomputed table to achieve a throughput of 4.3 cpb on a Pen- 
tium II. 

Other software efficient universal hash functions include Rogaway’s bucket 
hash [9]; the MMH function of Halevi and Krawczyk [7]; and the NH function of 
Black, Halevi, Krawczyk, Krovetz and Rogaway [4]. The last is the current speed 
champion, providing collision probabilities of 2“®*^ with 4 KB of precomputed 
data and achieving throughput of 1.0 cpb on a Pentium II using Pentium MMX
instructions, and 1.9 cpb without MMX. 

If one does not require the combinatoric certainties of universal hashing, 
one could employ cryptographic hashing to construct hash functions with short 
output lengths, short keys and little preprocessing. Bosselaers, Govaerts and 
Vandewalle report on optimized Pentium timing for several cryptographic hash 
functions: MD4 (3.8 cpb), MD5 (5.3 cpb) and SHA-1 (13.1 cpb) [5]. Simple 
methods can be used to convert these functions into universal hash functions by,
for example, keying their initial values [11]. We do not know what the collision 
probability would be for such constructions; for such a transformation to result 
in a good universal hash function, certain unproved assumptions must be made 
about the cryptographic hash function. 



1.2 Notation 

The algorithms described in this paper manipulate both bit-strings and integers.
The i-th bit of string M is denoted M[i] (bit-indices begin with 1). The substring
consisting of the i-th through j-th bits of M is denoted M[i...j]. The concatenation
of string $M_1$ followed by string $M_2$ is denoted $M_1 \| M_2$. The length in
bits of string M is |M|. The string of n zero-bits is denoted $0^n$.

Given b > 1, the constant p(b) is the largest prime smaller than $2^b$. Given
string M and b > 0, padonezero(M, b) returns the string $M \| 1 \| 0^n$, where n is
the smallest number that makes the length of $M \| 1 \| 0^n$ divisible by b. Given
a string M, the function str2num(M) returns the integer that results when M
is interpreted as an unsigned binary number. Similarly, num2str(n, b) produces
the unique b-bit string which is the binary representation for the non-negative
number n.

The number of elements in a set S is denoted |S|.

algorithm PolyCW[F](k, m)
    // Parameter: F is a finite field.
    // Input: k ∈ F and m = (m_0, ..., m_n) where m_i ∈ F for 0 ≤ i ≤ n.
    // Output: y ∈ F.
    Let n + 1 be the number of elements in m
    y ← 0
    for i ← 0 to n do
        y ← ky + m_i    // Arithmetic in F
    return y

Fig. 1. The basic polynomial-hashing method of Carter and Wegman on which we
build. The message m = (m_0, ..., m_n) is hashed to $y = m_0 k^n + \cdots + m_{n-1}k + m_n$.
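The string helpers named in the notation can be sketched as follows. This is a minimal Python illustration, assuming bit-strings are represented as Python `str` values made of the characters '0' and '1'; that representation is a choice made here, not something the paper prescribes.

```python
def padonezero(M: str, b: int) -> str:
    """Return M || 1 || 0^n with the least n making the length divisible by b."""
    padded = M + "1"
    return padded + "0" * (-len(padded) % b)

def str2num(M: str) -> int:
    """Interpret the bit-string M as an unsigned binary number."""
    return int(M, 2) if M else 0

def num2str(n: int, b: int) -> str:
    """Return the unique b-bit binary representation of n (requires 0 <= n < 2^b)."""
    if b < 1 or not 0 <= n < (1 << b):
        raise ValueError("n must be representable in b >= 1 bits")
    return format(n, "b").zfill(b)
```

For example, `padonezero("101", 8)` appends a single 1 and four 0s, and `num2str(str2num(M), len(M))` returns `M` for any non-empty bit-string.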



1.3 Organization 

In the next few sections we develop a fast polynomial hash function. We build up 
to it in a couple of stages. In the appendix we generalize the hash function using 
arbitrary parameters. Theorems are given in both cases, but proven only for the 
concrete case. Proofs for the parameterized cases are straightforward adaptations 
of the ones for the concrete version, so they are omitted. Understanding the 
algorithms, theorems and proofs is easier in the concrete examples. 



2 Carter-Wegman Polynomial Hashing: PolyCW

We begin by reviewing the “standard” approach for polynomial hashing. Let F
be a finite field, let k ∈ F be a point in that field (the “key”) and let m =
(m_0, ..., m_n) be a vector of points in F that we want to hash (the “message”).
We can hash message m to a point y in F (the “hash value”) by computing
$y = m_0 k^n + \cdots + m_{n-1}k^1 + m_n k^0$, where all arithmetic is done in F. We denote this
family of hash functions as PolyCW[F]. The computation of this hash function
(with n+1 multiplications in the field and n+1 additions in the field) is described
in Figure 1.¹

¹ All algorithms depicted in this paper which evaluate polynomials do so by using
Horner’s Rule, which says that the polynomial $m_0 k^n + \cdots + m_{n-1}k^1 + m_n k^0$ can be
rewritten as $m_n + k(m_{n-1} + k(m_{n-2} + k(m_{n-3} + \cdots)))$. This allows for simple iteration
with one multiplication and one addition for each element of the message.
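Horner's Rule as used in Figure 1 translates almost line for line into code. The sketch below assumes, for illustration, that the field F is a prime field $\mathbb{Z}_p$ and that message elements are Python integers already reduced modulo p; the function names are hypothetical.

```python
def poly_cw(k: int, m: list, p: int) -> int:
    """Horner evaluation: y = m_0 k^n + ... + m_(n-1) k + m_n (mod p)."""
    y = 0
    for m_i in m:
        y = (k * y + m_i) % p      # one multiplication and one addition per element
    return y

def poly_direct(k: int, m: list, p: int) -> int:
    """The same polynomial evaluated term by term, for comparison."""
    n = len(m) - 1
    return sum(m_i * pow(k, n - i, p) for i, m_i in enumerate(m)) % p

P32 = 2 ** 32 - 5                  # the prime p(32) used later in the paper
sample = poly_cw(12345, [3, 1, 4, 1, 5], P32)
```

Both functions return the same value; the Horner form avoids computing powers of k explicitly.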
algorithm PolyP32(k, m)
    // Input: k ∈ K_32 and m = (m_1, ..., m_n) where m_i ∈ Z_{p(32)} for 1 ≤ i ≤ n.
    // Output: y ∈ Z_{p(32)}.
    Let n be the number of elements in m
    p ← 2^32 − 5    // The largest prime smaller than 2^32
    y ← 0
    for i ← 1 to n do
        y ← ky + m_i mod p
    return y

Fig. 2. The PolyP32 algorithm. A variant of the PolyCW hash, accelerated by choosing
a field $\mathbb{Z}_{p(32)}$ in which calculations can be performed quickly and choosing a keyset
$K_{32}$ which reduces arithmetic overflow on 32-bit processors. The for loop could be
rewritten as the polynomial: $y = \sum_{i=1}^{n} (m_i k^{n-i}) \bmod p$.



PolyCW[F] is one of the most well-known universal hash-function families.
It was described by Carter and Wegman in the paper that introduced that
notion [6]. The main property it has is as follows. If $m = (m_n, \ldots, m_0)$ and
$m' = (m'_n, \ldots, m'_0)$ are distinct vectors with the same number of components
then $\Pr[H \leftarrow \mathrm{PolyCW}[F];\ k \leftarrow F : H_k(m) = H_k(m')] \le n/|F|$. This result is due
to the Fundamental Theorem of Algebra, which states that a nonzero polynomial
of degree at most n can have at most n roots. Rewriting the above probability as
$\Pr[k \leftarrow F : \sum_{i=0}^{n} m_i k^i = \sum_{i=0}^{n} m'_i k^i] = \Pr[k \leftarrow F : \sum_{i=0}^{n} (m_i - m'_i)k^i = 0]$,
and applying the Fundamental Theorem, we see that there can be at most n
values for k which cause $\sum_{i=0}^{n} (m_i - m'_i)k^i$ to evaluate to zero.
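The collision bound can be seen concretely by enumerating every key of a small field. The following Python sketch assumes an illustrative prime p = 101 and two arbitrary sample messages; neither comes from the paper. It lists the keys under which the two messages collide.

```python
def poly_hash(k: int, m: tuple, p: int) -> int:
    """Polynomial hash of m under key k over Z_p, evaluated by Horner's Rule."""
    y = 0
    for coeff in m:
        y = (k * y + coeff) % p
    return y

def colliding_keys(m1: tuple, m2: tuple, p: int) -> list:
    """All keys k in Z_p for which m1 and m2 hash to the same value."""
    return [k for k in range(p) if poly_hash(k, m1, p) == poly_hash(k, m2, p)]

p = 101
m1, m2 = (1, 2, 3, 4), (4, 3, 2, 1)      # distinct messages with n + 1 = 4 elements
bad = colliding_keys(m1, m2, p)
collision_probability = len(bad) / p      # at most n / |F| = 3 / 101
```

The difference of the two messages is a nonzero polynomial of degree at most 3 in k, so `bad` can contain at most three keys; k = 1 is one of them because both messages have the same coefficient sum.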

3 Making PolyCW [F] Fast 

Care must be taken in the implementation of PolyCW[F]. A naive implementation
is unlikely to perform well. Many choices of F and the set from which the
hash key is chosen can result in sub-optimal performance. We investigate the
effect that shrewd choices for F and the key-set have on performance.

Field Selection. To make an efficient and practical hash function out of
PolyCW[F] we should carefully choose the finite field F. Fields like GF[$2^{64}$] make
natural candidates, because we are ultimately interested in hashing bit strings
which are easily partitioned into 64-bit substrings. But arithmetic in such fields
turns out to be less convenient for contemporary CPUs than a well-chosen alternative.
In this paper we will do better by using prime fields in which the prime
is just smaller than a power of two.

Consider first the use of the prime $p(32) = 2^{32} - 5$, which is the largest
prime less than $2^{32}$. To implement PolyCW[$\mathbb{Z}_{p(32)}$] efficiently, we need a good
way to calculate $y \leftarrow ky + m \bmod p(32)$, where $y, k, m \in \mathbb{Z}_{p(32)}$. There are
several options. One’s first instinct is to use the native “mod” operator of a high-level
programming language (like “%” in C), or to use a corresponding operator
in the hardware architecture. But these choices are usually slow. For example,
PolyCW[$\mathbb{Z}_{p(32)}$], implemented in assembly using the native mod operator, runs
in 12.4 cpb (cycles/byte) on a Pentium II.

A faster method exploits the fact that since $p(32) = 2^{32} - 5$, the numbers $2^{32}$
and 5 are equivalent in the field $\mathbb{Z}_{p(32)}$, so $2^{33} = 10$, $2^{34} = 20$ and, more generally,
$a2^{32} = 5a$ in $\mathbb{Z}_{p(32)}$. So, to calculate $ky \bmod p(32)$, first compute the 64-bit
product $z = ky$ and separate $z$ into a 32-bit high-word $a$ and a 32-bit low-word
$b$ so that $z = a2^{32} + b$. We can then use the observation just made and rewrite
$z \bmod p(32)$ as $5a + b$. This means that the calculation $y = ky + m \bmod p(32)$
can be done by computing $y = 5a + b + m \bmod p(32)$, which can be implemented
more cheaply than the original approach because it does not require division to
perform the modular reduction.
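The high-word/low-word rewriting can be checked with a few lines of Python. This is a sketch only: Python integers are unbounded, so the 64-bit product and the 32-bit words are simulated with shifts and masks, and a final `%` is kept for clarity even though the operand it sees is now small (below $7 \cdot 2^{32}$) rather than a 64-bit value.

```python
P32 = 2 ** 32 - 5
MASK32 = 0xFFFFFFFF

def step_naive(k: int, y: int, m: int) -> int:
    """Reference computation: (k*y + m) mod p(32)."""
    return (k * y + m) % P32

def step_split(k: int, y: int, m: int) -> int:
    """Same value via z = a*2^32 + b and the congruence 2^32 = 5 (mod p(32))."""
    z = k * y                  # fits in 64 bits when k, y < 2^32
    a = z >> 32                # 32-bit high word
    b = z & MASK32             # 32-bit low word
    return (5 * a + b + m) % P32
```

The two functions agree on all inputs below $2^{32}$; the point of the second form is that the quantity being reduced no longer needs a 64-bit division.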

Key-Set Selection. When implemented on a 32-bit architecture, the values
$a$, $b$ and $m$ just discussed fit conveniently into 32-bit registers, making these
quantities easy to manipulate. On most such architectures, the calculation of $y$
is going to be fastest if it is done with minimal register overflow. To calculate
$y = 5a + b + m \bmod p(32)$ using only 32-bit registers, we need one multiplication,
two additions and then some additional instructions to handle register overflow.
Each operation that can result in register overflow requires several instructions,
including a conditional move or branch, to check and deal with the potential
overflow event. To accelerate the calculation of $y$ we reduce the number of
potential overflows. Little can be done about overflow from the additions because
both $b$ and $m$ can be nearly $2^{32}$, but overflow from the multiplication can be
eliminated. Only if $a$ is larger than $\lfloor 2^{32}/5 \rfloor \approx 2^{29.7}$ can the term $5a$ overflow a
32-bit register. We can restrict $a$ to safe values by restricting $k$ to values less
than $2^{29}$. This allows for a faster implementation. The expense for this
optimization is a higher collision probability because the key is chosen from a set of $2^{29}$
elements instead of a set of $2^{32}$ elements.
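
The effect of the key restriction can be checked directly. The sketch below is
illustrative Python (the helper high_word and the variable names are ours): it computes
the largest high-word $a$ that can arise for a key below $2^{29}$ and for an unrestricted
key, and records whether $5a$ still fits in a 32-bit register.

```python
def high_word(k: int, y: int) -> int:
    """High 32 bits (the value a) of the 64-bit product k*y."""
    return (k * y) >> 32


Y_MAX = 2**32 - 1                               # largest value a 32-bit register holds
a_restricted = high_word(2**29 - 1, Y_MAX)      # worst case for a key k < 2^29
a_unrestricted = high_word(2**32 - 6, Y_MAX)    # worst case for a key k in Z_p(32)

fits_restricted = 5 * a_restricted < 2**32      # no overflow possible
fits_unrestricted = 5 * a_unrestricted < 2**32  # overflow is possible
```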

Divisionless modular reduction. Another optimization over a naive
implementation is the elimination of division to calculate modular reductions. This
technique is not new. In calculating $y = 5a + b + m \bmod p(32)$, each of the $5a$,
$b$ and $m$ terms is less than $2^{32}$. As we sum them using computer arithmetic
with 32-bit registers, we can easily detect 32-bit overflows. Each such overflow
indicates a $2^{32}$ term which is not accounted for in the resulting register. But,
because $2^{32} = 5$ in $Z_{p(32)}$, these overflows are easily accounted for by adding 5 for each
overflow to the resulting register. Done carefully, this observation results in a
number $y$, derived without any division, which is representable in 32 bits (i.e.
$0 \le y < 2^{32}$). See Figure 3 for implementation details. Do we then need to
reduce $y$ to a number in $Z_{p(32)}$? No. All of the discussion so far requires only that
$y$ be representable in a 32-bit register. Instead of reducing $y$ to be in $Z_{p(32)}$ after
every intermediate calculation, we defer all such reductions until the end, when
a final single reduction is performed.

    ; Calculate y = y * k + m mod p(32)
    ; Assume y is in register eax before and after code segment.
    mul    k                     ; edx:eax = k * y
    lea    edx, [edx*4+edx]      ; edx = 5 * edx
    add    eax, edx              ; eax = edx + eax
    lea    edx, [eax+5]          ; edx = eax + 5
    cmovc  eax, edx              ; if (carried) then eax = edx
    add    eax, m                ; eax = eax + m
    lea    edx, [eax+5]          ; edx = eax + 5
    cmovc  eax, edx              ; if (carried) then eax = edx

Fig. 3. The y = ky + m calculation of the PolyP32 algorithm written in Pentium II
assembly. The flag "carried" is true only if the previous add instruction causes a
register overflow. The conditional-move instruction (cmovc) is used to avoid any branches
during execution of the routine, and the load-effective-address instruction (lea) is used
for addition and multiplication of small constants. The result of the routine could
possibly be in the range $p \le y < 2^{32}$, which is outside of the field $Z_{p(32)}$, but this is easily
fixed with a single subtraction after hashing the final word of the entire message.
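
To see why the routine of Figure 3 never loses a carry, it can be emulated one
instruction at a time. The following Python sketch is ours and is not the measured
implementation; it assumes $y < 2^{32}$, $k < 2^{29}$ and $m < p(32)$, and masks every
intermediate value to 32 bits the way the registers would.

```python
MASK32 = 0xFFFFFFFF
P32 = 2**32 - 5


def step_p32(y: int, k: int, m: int) -> int:
    """One y <- k*y + m step; result is < 2^32 and congruent to k*y + m mod p(32)."""
    z = k * y                          # mul: edx:eax = k * y
    edx, eax = z >> 32, z & MASK32
    edx = (5 * edx) & MASK32           # lea: edx = 5 * edx (cannot wrap when k < 2^29)
    total = eax + edx                  # add: eax = eax + edx, sets the carry flag
    carry, eax = total >> 32, total & MASK32
    edx = (eax + 5) & MASK32           # lea: edx = eax + 5
    if carry:                          # cmovc: a lost 2^32 is worth 5
        eax = edx
    total = eax + m                    # add: eax = eax + m
    carry, eax = total >> 32, total & MASK32
    edx = (eax + 5) & MASK32           # lea: edx = eax + 5
    if carry:                          # cmovc
        eax = edx
    return eax
```

Because the result is only guaranteed to be below $2^{32}$, a caller that wants an
element of $Z_{p(32)}$ would subtract p(32) once at the very end if the value is not
already smaller than p(32).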



Speed. Taken together, the selection of a convenient prime field and the
restriction of the key-set to keys which eliminate some register overflows allow a
nice speed-up over a naive implementation of PolyCW. Figure 2 shows a version
of polynomial hash based upon PolyCW which hashes over the field $Z_{p(32)}$ and
restricts key selection to the set $K_{32} = \{a : 0 \le a < 2^{29}\}$. Our implementation
of the core $y = ky + m \bmod p(32)$ calculation uses just 8 lines of Pentium II
assembly (Figure 3) and achieves a peak throughput of 3.69 cpb.

We state here the (simple) proposition establishing the collision bound of the
PolyP32 hash function.

Proposition 1. For any positive $n$ and distinct messages $m = (m_0, \ldots, m_n)$
and $m' = (m'_0, \ldots, m'_n)$, consisting of elements from $Z_{p(32)}$, the probability
$\Pr[k \leftarrow K_{32} : \mathrm{PolyP32}(k, m) = \mathrm{PolyP32}(k, m')]$ is no more than
$n/|K_{32}| = n 2^{-29}$.

64-Bit Hashing and Key Restriction. We also implemented an analogous
PolyP64 hash function whose core calculation is $y = ky + m \bmod p(64)$ where
$p(64) = 2^{64} - 59$ and $k$, $y$ and $m$ are all elements of $Z_{p(64)}$. As in the
32-bit case, it is cheapest to calculate the result without using division. If we let
$2^{32} k_h + k_l$ represent $k$ and $2^{32} y_h + y_l$ represent $y$, then $ky$ can be calculated as
$ky = 2^{64} k_h y_h + 2^{32} (k_h y_l + k_l y_h) + k_l y_l$. Again, restricting the set of values that $k$
can take on allows for faster implementations by eliminating some 32-bit register
overflows. We define key-set $K_{64} = \{a 2^{32} + b : 0 \le a, b < 2^{25}\}$. This restriction
allows an implementation of PolyP64 which has a collision probability of $n/2^{50}$,
uses 40 lines of assembly and has a peak throughput of 6.86 cpb.

algorithm PolyQ32(k, M)
// Input: k in K_32 and M in ({0,1}^32)^+.
// Output: y in Z_p(32).
p <- 2^32 - 5                          // Largest prime smaller than 2^32
offset <- 5                            // For translating out-of-range words
marker <- 2^32 - 6                     // For marking out-of-range words
n <- |M|/32
M_1 || ... || M_n <- M,                // Break M into 32-bit chunks
    where |M_1| = ... = |M_n| = 32
y <- 1                                 // Set highest coefficient to 1
for i <- 1 to n do
    m <- str2num(M_i)
    if (m >= p - 1) then               // If word is not in range, then
        y <- ky + marker mod p         // Marker indicates out-of-range
        y <- ky + (m - offset) mod p   // Offset m back into range
    else
        y <- ky + m mod p              // Otherwise hash in-range word
return y

Fig. 4. The PolyQ32 algorithm. The PolyP32 hash extended to hash strings instead
of vectors of field elements and to allow good collision probabilities over two strings
which differ in length.
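
Figure 4 translates almost line for line into a high-level language. The sketch below
is ours and makes two assumptions the figure leaves open: the message is given as a
byte string whose length is a positive multiple of 4, and str2num reads each 32-bit
word big-endian. It uses the ordinary % operator rather than the divisionless
reduction, since only the input/output behaviour is of interest here.

```python
P32 = 2**32 - 5       # largest prime smaller than 2^32
OFFSET = 5            # for translating out-of-range words
MARKER = 2**32 - 6    # p - 1, for marking out-of-range words


def poly_q32(k: int, data: bytes) -> int:
    """Hash a non-empty sequence of 32-bit words under key k (0 <= k < 2^29)."""
    if not data or len(data) % 4:
        raise ValueError("message must be a positive number of 32-bit words")
    y = 1                                          # implicit leading coefficient 1
    for i in range(0, len(data), 4):
        m = int.from_bytes(data[i:i + 4], "big")   # assumed str2num convention
        if m >= P32 - 1:                           # word is not in range
            y = (k * y + MARKER) % P32             # marker indicates out-of-range
            y = (k * y + (m - OFFSET)) % P32       # offset m back into range
        else:
            y = (k * y + m) % P32                  # hash in-range word
    return y
```

For instance, with key 3 the single word 1 hashes to 3 * 1 + 1 = 4, while the two
words (0, 1) hash to 3 * (3 * 1 + 0) + 1 = 10, so prepending a zero word changes the
result.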



4 Expanding the Domain to Arbitrary Strings 

The hash function PolyP32 is not generally useful. It only works on same-length
messages, and those messages must be made of elements from the field $Z_{p(32)}$.
We now remove these limitations and develop PolyQ32. The result, depicted in
Figure 4, hashes most messages at a rate of 3.86 cpb.

Allowing Variations in Length. It is a trivial exercise to produce two
different-length messages which collide when hashed with PolyP32 using any
key: under PolyP32, the hash of a message $m = (m_0, \ldots, m_n)$ using key $k$ is
simply $h(k, m) = m_0 k^n + \cdots + m_n k^0 \bmod p(32)$, so prepending 0 to the vector
$m$ results in a message $m' = (0, m_0, \ldots, m_n)$ which is hashed as
$h(k, m') = 0 k^{n+1} + m_0 k^n + \cdots + m_n k^0 \bmod p(32)$ and is equal to $h(k, m)$ because the
additional zero-term has no effect on the hash value. For the Fundamental
Theorem of Algebra to guarantee a low number of roots (and hence a low collision
probability), it is essential that the difference between $m$ and $m'$ be non-zero.
This means that if the two vectors differ only in length, then at least one of
the initial elements of the longer vector must be non-zero. To guarantee this
we employ a standard trick and implicitly prepend a "1" to the vectors being
hashed. Thus, the hash of $m = (m_0, \ldots, m_n)$ implicitly becomes the hash of
$m = (1, m_0, \ldots, m_n)$, and the hash of $m' = (0, m_0, \ldots, m_n)$ implicitly becomes
the hash of $m' = (1, 0, m_0, \ldots, m_n)$. The difference between these two vectors is
non-zero. The following proposition assures that augmenting PolyP32 in this way
results in a hash with nearly the same collision probability as PolyP32, but works
over messages of different lengths.
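
The collision and its repair are easy to reproduce. In the following Python sketch
(ours; the function name horner_p32 and the sample message are illustrative) the same
Horner loop is run with and without the implicit leading 1.

```python
P32 = 2**32 - 5


def horner_p32(k: int, coeffs: list, leading_one: bool) -> int:
    """Evaluate the message polynomial at k, optionally with an implicit leading 1."""
    y = 1 if leading_one else 0
    for c in coeffs:
        y = (k * y + c) % P32
    return y


message = [7, 8, 9]
longer = [0] + message          # same message with a zero prepended
key = 12345

plain_collides = horner_p32(key, message, False) == horner_p32(key, longer, False)
fixed_collides = horner_p32(key, message, True) == horner_p32(key, longer, True)
```

Without the leading 1 the two hashes agree for every key; with it, their difference
is $k^4 - k^3 = k^3 (k - 1)$, which vanishes only for $k = 0$ and $k = 1$.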

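Proposition 2. Let $\ell \le n$ be positive integers. Let $m = (m_0, \ldots, m_\ell)$ and
$m' = (m'_0, \ldots, m'_n)$ be any two distinct vectors of elements from the field F. Then there are at
most $n + 1$ values for $k \in F$ such that
$k^{\ell+1} + \sum_{i=0}^{\ell} m_i k^{\ell-i} = k^{n+1} + \sum_{i=0}^{n} m'_i k^{n-i}$.

Proof. Beginning with
$k^{\ell+1} + \sum_{i=0}^{\ell} m_i k^{\ell-i} = k^{n+1} + \sum_{i=0}^{n} m'_i k^{n-i}$ and moving
all of its terms to the right side of the equation we get
$0 = k^{n+1} - k^{\ell+1} + \sum_{i=0}^{n} m'_i k^{n-i} - \sum_{i=0}^{\ell} m_i k^{\ell-i}$.
But the right side of this equation is now a non-zero polynomial of degree at most
$n + 1$, and therefore has at most $n + 1$ roots. □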

Alternative Method. Another method of augmenting PolyP32 to allow
variable length messages is to use a second key $k' \in Z_{p(32)}$ and add it to each
element of the message being hashed. Thus, $h(k, k', m)$ would be computed as
$\sum_{i=0}^{n} (m_i + k') k^{n-i}$. This method requires an extra addition per message word
being hashed and so the first method seems favorable.

Allowing Bit-Strings. To make the function PolyP32 of Figure 2 more useful,
it must be adapted to allow bit-strings rather than only vectors from $Z_{p(32)}$. The
field $Z_{p(32)}$ was chosen because it contains nearly all the numbers representable
as 32-bit strings. Thus, when we desire to hash a bit-string, we may partition the
string into 32-bit words and treat the partition as a vector of 32-bit numbers.
PolyP32 can then hash the vast majority of the vector's elements without any
modification. But, some of the 32-bit numbers may be in the range
$p(32), \ldots, 2^{32} - 1$, outside $Z_{p(32)}$. What should be done with them?

One approach is to transform a vector of 32-bit numbers, which may have
some elements outside of $Z_{p(32)}$, into a vector which does not. The transformation
must map distinct vectors into distinct vectors.

We solve this problem by examining a vector of 32-bit numbers and replacing
each vector element $m_i$ that is greater than $p(32) - 2$ with two numbers,
$p(32) - 1$ and $m_i - 5$. Note that both of these numbers are in $Z_{p(32)}$. Each such
replacement lengthens the resulting vector by one element. Thus, the vector
$m = (4, 2^{32} - 3, 10)$, whose second element is greater than $p(32) - 2$, would be
transformed into the vector $m' = (4, 2^{32} - 6, 2^{32} - 8, 10)$. We call this
transformation DoubleTransform : $(Z_{2^{32}})^+ \to (Z_{p(32)})^+$. The following proposition assures
that DoubleTransform is correct.
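
The transformation is short enough to state as code. This Python sketch is ours (the
function name double_transform mirrors DoubleTransform, and words are plain integers
below $2^{32}$); it reproduces the worked example from the text.

```python
P32 = 2**32 - 5


def double_transform(words: list) -> list:
    """Map a vector of 32-bit numbers to a vector of elements of Z_p(32)."""
    out = []
    for m in words:
        if m > P32 - 2:                    # out of range, or equal to the marker value
            out.extend((P32 - 1, m - 5))   # marker, then the word shifted into range
        else:
            out.append(m)
    return out


example = double_transform([4, 2**32 - 3, 10])
```

Note that a word equal to the marker value $p(32) - 1$ is itself expanded, which is
what keeps an input that already contains the marker from colliding with a
transformed out-of-range word.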

Proposition 3. For positive $\ell$ and $n$, and distinct messages $m = (m_0, \ldots, m_\ell)$
and $m' = (m'_0, \ldots, m'_n)$ made of elements from $Z_{2^{32}}$, the transformed vectors
DoubleTransform(m) and DoubleTransform(m') consist of elements from $Z_{p(32)}$
and are distinct.

Proof. Let $\ell$ and $n$ be positive integers and let $m = (m_0, \ldots, m_\ell)$ and
$m' = (m'_0, \ldots, m'_n)$ be distinct vectors consisting of elements from $Z_{2^{32}}$. Let
t = DoubleTransform(m) and t' = DoubleTransform(m'). Let $i$ be the smallest
number such that $m_i \ne m'_i$. If such an $i$ does not exist then one of $m$ or $m'$ must be
a proper prefix of the other. In this case, any lengthening of the shorter vector
by DoubleTransform must be mirrored by the transformation of the longer vector,
ensuring that the two remain different lengths after transformation.

If $m_i$ and $m'_i$ are both less than $p(32) - 1$, then after transformation $t_i = m_i$
and $t'_i = m'_i$, ensuring that $t \ne t'$. If only one of $m_i$ and $m'_i$ is less than $p(32) - 1$,
say $m_i$, then after transformation $t_i = m_i$ and $t'_i = p(32) - 1$, again ensuring
that $t \ne t'$. Finally, if both $m_i$ and $m'_i$ are greater than $p(32) - 2$, then after
transformation $t_{i+1} = m_i - 5$ and $t'_{i+1} = m'_i - 5$, again ensuring that $t \ne t'$.

Alternative Method. There are many ways to patch PolyP32 to allow
out-of-range elements. One probabilistic alternative is to offset every out-of-range
number by a randomly chosen $k' \in \{5, \ldots, 2^{32} - 5\}$. All out-of-range numbers are
in $\{2^{32} - 5, \ldots, 2^{32} - 1\}$, so $k'$, when subtracted from an out-of-range number, will
always yield a number in $Z_{p(32)}$. This method has the advantage of not increasing
message length upon transformation, but requires an extra key element, and in
practice does not speed hashing with respect to the method of Proposition 3.

Together, Propositions 2 and 3 prove the following corollary.

Corollary 1. For any positive integers $\ell \le n$ and distinct messages
$M \in \{0,1\}^{32\ell}$ and $M' \in \{0,1\}^{32n}$,
$\Pr_{k \in K_{32}}[\mathrm{PolyQ32}(k, M) = \mathrm{PolyQ32}(k, M')] \le 2n/|K_{32}| = n 2^{-28}$.

The discussion so far has focussed on PolyQ32, a hash function defined on 32-bit
words. An analogous 64-bit variant, PolyQ64, yields the following bound.

Corollary 2. For any positive integers $\ell \le n$ and distinct messages
$M \in \{0,1\}^{64\ell}$ and $M' \in \{0,1\}^{64n}$,
$\Pr_{k \in K_{64}}[\mathrm{PolyQ64}(k, M) = \mathrm{PolyQ64}(k, M')] \le 2n/|K_{64}| = n 2^{-49}$.

Two things are worth noting. First, the factor of two introduced in the $2n/|K_{32}|$
term is due to the potential doubling of message length by the DoubleTransform
function. And, second, standard message padding techniques are not addressed
in this paper. It is assumed that messages being hashed have been properly
padded to a 32-bit boundary.

It should also be noted that the probability that PolyQ32 or PolyQ64 hash any
message to a particular result is also low. Consider a message made of $n$ 32-bit
words $x = (x_1, \ldots, x_n)$ and a constant $c$. If $c \ge p(32)$ then PolyQ32(k, x) cannot
hash to $c$, and if $c < p(32)$, then PolyQ32(k, x) will hash to $c$ only if PolyQ32(k, x')
hashes to zero where $x' = (x_1, \ldots, x_n - c)$. After the DoubleTransform
transformation of $x'$, the Fundamental Theorem tells us that there are no more than $2n$
keys which allow this to happen.

Claim. Let $n$ and $c$ be numbers, and let message $M$ be an element from $\{0,1\}^{32n}$,
then $\Pr_{k \in K_{32}}[\mathrm{PolyQ32}(k, M) = c] \le 2n/|K_{32}|$.

algorithm PolyR32_64(k, M)
// Input: k = (k_1, k_2) with k_1 in K_32 and k_2 in K_64, and M in {0,1}^*.
// Output: Y in {0,1}^64
if (|M| <= 2^14) then                               // 2^9 32-bit words
    M <- padonezero(M, 32)
    y <- PolyQ32(k_1, M)                            // Hash in Z_p(32)
else if (|M| <= 2^39) then                          // 2^33 64-bit words
    M_1 <- M[1 ... 2^14]
    M_2 <- M[2^14 + 1 ... |M|]
    M_2 <- padonezero(M_2, 64)
    y <- PolyQ32(k_1, M_1)                          // Hash in Z_p(32)
    y <- PolyQ64(k_2, num2str(y, 64) || M_2)        // Hash in Z_p(64), prepending y
else
    return Error                                    // Message too long
Y <- num2str(y, 64)                                 // Convert to string
return Y

Fig. 5. The PolyR32_64 algorithm. Combining the PolyP32 and PolyP64 hashes into
a hash function which is fast on short messages but also performs well on long ones.
PolyR32_64 also extends the domain to messages which are not a multiple of the
constituent hashes' word-lengths.
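
The control flow of Figure 5 can be sketched in Python as follows. This is an
illustration under several assumptions of ours, not a reference implementation:
messages are byte strings, words are read big-endian, the crossover points are $2^{14}$
and $2^{39}$ bits, and padonezero (which the figure does not define) appends one 1 bit
followed by zero bits up to the next word boundary. The helper poly_q is a generic
version of PolyQ32 and PolyQ64 using the % operator.

```python
P32 = 2**32 - 5
P64 = 2**64 - 59


def poly_q(k: int, data: bytes, word_bytes: int, p: int) -> int:
    """PolyQ over words of word_bytes bytes in the prime field Z_p."""
    offset = 2**(8 * word_bytes) - p
    marker = p - 1
    y = 1
    for i in range(0, len(data), word_bytes):
        m = int.from_bytes(data[i:i + word_bytes], "big")
        if m >= marker:
            y = (k * y + marker) % p
            y = (k * y + (m - offset)) % p
        else:
            y = (k * y + m) % p
    return y


def padonezero(data: bytes, word_bytes: int) -> bytes:
    """Assumed padding: a single 1 bit, then zeros up to a word boundary."""
    data += b"\x80"
    return data + b"\x00" * (-len(data) % word_bytes)


def poly_r32_64(k1: int, k2: int, msg: bytes) -> bytes:
    """Two-stage ramped hash; k1 is from K_32 and k2 from K_64."""
    bits = 8 * len(msg)
    if bits <= 2**14:                               # short: PolyQ32 only
        y = poly_q(k1, padonezero(msg, 4), 4, P32)
    elif bits <= 2**39:                             # long: ramp up to PolyQ64
        m1 = msg[:2**14 // 8]
        m2 = padonezero(msg[2**14 // 8:], 8)
        y = poly_q(k1, m1, 4, P32)
        y = poly_q(k2, y.to_bytes(8, "big") + m2, 8, P64)
    else:
        raise ValueError("message too long")
    return y.to_bytes(8, "big")                     # num2str(y, 64)
```

With these conventions the empty message under key 12345 pads to the single word
0x80000000 and hashes to 12345 + 0x80000000.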



5 PolyR: Overcoming Polynomial Hash Length 
Limitations 

Taking a closer look at the bounds established for each of the polynomial hash
functions, one can see that the collision bounds degrade linearly along with the
length of the messages being hashed. This is a byproduct of the use of
polynomials in hashing: as messages get longer, the degrees of the polynomials
get higher, resulting in more potential collision-causing roots. This introduces a
trade-off in application design. If one wants to guarantee some maximum
collision probability $\epsilon$ and the hash-key is chosen from a set of $k$ elements, then the
length of messages to be hashed must be limited to around $k\epsilon$ words. The larger
the key-set size $k$ used in the hashing polynomial, the more words can be hashed
before reaching the allowable collision probability $\epsilon$. But, to make the key-set
size significantly larger requires the polynomial to be computed over a larger
prime-field, and in general, as the prime $p$ is increased, so is the time needed to
evaluate the polynomials in $Z_p$. As one can see by examining the timing results
for PolyQ32 and PolyQ64, the move from a prime close to $2^{32}$ to one close to $2^{64}$
increases the number of cycles-per-byte required to hash a message by nearly
50%.^2

^2 Some of this difference is an artifact of the fact that the Pentium II natively supports
multiplication of 32-bit operands to a 64-bit result, but not the multiplication of 64-bit
operands to a 128-bit result. Most processors will display this type of threshold
behavior when operands exceed well-supported lengths.

Can we have the best of both worlds: a hash function which is as fast as
PolyQ32 but can hash messages as long as PolyQ64, without having intolerably
high collision probability? This is the goal which motivates this section. We
approach the problem with the belief that most strings being hashed are short,
but that a generalized hash function should also handle long messages well.

Our idea is to hash short messages (up to some fixed number of bits $\ell$) directly
with PolyQ32, but hash messages longer than $\ell$ bits with a hybrid scheme. Let
us say that message $M$ is longer than $\ell$ bits. To hash $M$ we first partition it
into its $\ell$-bit prefix $M_1$ and the remainder $M_2$, so that $M_1 \| M_2 = M$. The hash
of $M$ under our hybrid scheme is then PolyQ64($k_2$, PolyQ32($k_1$, $M_1$) || $M_2$). In
this manner, the first $\ell$ bits of $M$ are hashed with a fast hash function (which
cannot safely hash long messages), and if there is any of the string left after
hashing its prefix, the remainder is hashed with a slower hash function (which
can safely hash longer messages). The parameter $\ell$ depends on the maximum
desirable collision bound and how long a message can be before the fast hash
function approaches this bound.

As an example, let us say that we want to hash messages and have a collision
bound of no more than $2^{-16}$. If we were to hash solely with PolyQ32, then we
could hash no messages longer than around $2^{17}$ bits. Alternatively, we could
hash with only PolyQ64 and would then be able to hash strings as long as $2^{39}$
bits before allowing $2^{-16}$ collision probability, but at a much slower rate than
PolyQ32. Under our scheme, if a message $M$ is shorter than $2^{17}$ bits, then the
hash result is simply PolyQ32(M); whereas if $M$ is longer than $2^{17}$ bits, then
the hash is calculated as PolyQ64(PolyQ32($M_1$) || $M_2$) where $M_1$ is the $2^{17}$-bit
prefix of $M$. Such a construction is fast on short messages, but handles long
messages well too. If messages were anticipated to be longer than $2^{39}$ bits, then a function
PolyQ96, employing a 96-bit prime modulus, could be defined analogously and be
employed as a third-stage polynomial. This ramping-up of the prime modulus
used in the polynomial evaluations gives the construction its name: Ramped
polynomial hashing.

One might expect the collision bound of such a hybrid approach to be
approximately the sum of the collision bounds of each of its constituent functions,
but as the following theorem shows, the overall collision bound is instead
approximately the maximum of the constituent bounds.

The following theorem and proof address PolyR32_64, the ramped polynomial
hash of Figure 5. This concrete hash function hashes up to $2^{14}$ bits (equivalent
to $2^{9}$ 32-bit words) using the fast PolyQ32 function, and allows a total message
length of up to $2^{39}$ bits (or $2^{33}$ 64-bit words). In the following theorem and proof,
for increased generality, we use parameters $\ell$ and $m$ instead of the numbers of
words $2^{9}$ and $2^{33}$.

Theorem 1. Let $\ell = 2^{9}$ and $m = 2^{33}$. Let $M \ne M'$ be messages no longer than
$64m$ bits. Then the probability
$\Pr[k_1 \leftarrow K_{32}, k_2 \leftarrow K_{64} : \mathrm{PolyR32\_64}(k_1, k_2, M) = \mathrm{PolyR32\_64}(k_1, k_2, M')]$
is no more than $\max(\ell 2^{-28}, m 2^{-49}) + 2^{-50}$.
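
As a quick check of the arithmetic, the bound can be evaluated exactly. The Python
sketch below is ours; it assumes the bound has the form
$\max(\ell 2^{-28}, m 2^{-49}) + 2^{-50}$ and takes $\ell = 2^{9}$ and $m = 2^{33}$ words, the
crossover points we read from Figure 5.

```python
from fractions import Fraction


def polyr32_64_bound(l_words: int, m_words: int) -> Fraction:
    """max(l * 2^-28, m * 2^-49) + 2^-50 as an exact rational number."""
    return max(Fraction(l_words, 2**28), Fraction(m_words, 2**49)) + Fraction(1, 2**50)


short_term = Fraction(2**9, 2**28)      # contribution of the PolyQ32 stage: 2^-19
long_term = Fraction(2**33, 2**49)      # contribution of the PolyQ64 stage: 2^-16
bound = polyr32_64_bound(2**9, 2**33)   # 2^-16 + 2^-50
```

Under these assumptions the long-message term dominates, and the overall bound
exceeds $2^{-16}$ only by the $2^{-50}$ term.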

Proof. Let $M$ and $M'$ be messages, and imagine partitioning them into
$M = M_1 \| M_2$ and $M' = M'_1 \| M'_2$ so that $M_1$ and $M'_1$ are the first $32\ell$ bits of $M$ and
$M'$. If $M$ is shorter than $32\ell$ bits, then $M_1 = M$ and $M_2$ is empty. Likewise,
if $M'$ is shorter than $32\ell$ bits, then $M'_1 = M'$ and $M'_2$ is empty. We ignore all
padding issues in this discussion, assuming that standard padding techniques are
used to bring $M$ and $M'$ to appropriate lengths. Let $k_1 \leftarrow K_{32}$ and $k_2 \leftarrow K_{64}$
be randomly chosen keys. We define the following values here for convenience,
but all probabilities in this proof are assumed to be taken over these random
choices of $k_1$ and $k_2$.

$h_1$ = num2str(PolyQ32($k_1$, $M_1$), 64) and $h'_1$ = num2str(PolyQ32($k_1$, $M'_1$), 64)
$h_2$ = num2str(PolyQ64($k_2$, $h_1 \| M_2$), 64) and $h'_2$ = num2str(PolyQ64($k_2$, $h'_1 \| M'_2$), 64)

Depending on the lengths of $M$ and $M'$, the result of hashing $M$ will be $h_1$ or
$h_2$ and the result of hashing $M'$ will be $h'_1$ or $h'_2$. We examine several cases for
the relative lengths of $M$ and $M'$.

Case 1: Different lengths, same ramp. Here we examine the case where
the messages $M$ and $M'$ are different lengths, but are both either longer than
$32\ell$ bits or both no longer. If both are no longer than $32\ell$ bits then a collision
occurs if $h_1 = h'_1$. But, $M_1 = M$ and $M'_1 = M'$ differ in length which means (by
Proposition 1) that $\Pr[h_1 = h'_1] \le \ell 2^{-28}$. If both $M$ and $M'$ are longer than $32\ell$
bits then a collision occurs if $h_2 = h'_2$. But, $h_1 \| M_2$ and $h'_1 \| M'_2$ also differ in
length which means (by Proposition 2) that $\Pr[h_2 = h'_2] \le m 2^{-49}$.

Case 2: Different lengths, different ramp. If $M$ is longer than $32\ell$ bits
and $M'$ is not, then a collision occurs only if $h'_1 = h_2$. Expanding the $h_2$ term, we
see that a collision only occurs if $h'_1$ = num2str(PolyQ64($k_2$, $h_1 \| M_2$), 64). If we
fix $k_1$ to an arbitrary value, then $h_1$ and $h'_1$ become fixed as well, and the
probability of collision then depends only on the selection of $k_2$. The string $h_1 \| M_2$ is
partitioned by the PolyQ64 algorithm into 64-bit strings and then transformed by
DoubleTransform into some sequence $x_0, x_1, \ldots, x_n$, with $n \le 2m$, of elements from $Z_{p(64)}$.
This sequence is then used in the summation $\sum_{i=0}^{n} x_i k_2^{n-i} \bmod p(64)$ to
calculate the final hash result. A collision occurs if the result of this summation is $h'_1$,
or alternatively when $\sum_{i=0}^{n} x_i k_2^{n-i} - h'_1 \bmod p(64) = 0$. The Fundamental
Theorem of Algebra applies to this last polynomial, meaning there are no more than
$n \le 2m$ values for $k_2$ which satisfy it. Thus, $\Pr[h'_1 = h_2] \le 2m 2^{-50} = m 2^{-49}$.

Case 3: Equal length messages, last ramp different. If $M$ and $M'$
are equal length, longer than $32\ell$ bits and $M_2 \ne M'_2$, then (by Proposition 2)
$\Pr[h_2 = h'_2] \le m 2^{-49}$ because $h_1 \| M_2$ and $h'_1 \| M'_2$ are distinct. Similarly, if $M$
and $M'$ are the same length, no longer than $32\ell$ bits and $M_1 \ne M'_1$, then (by
Proposition 1) $\Pr[h_1 = h'_1] \le \ell 2^{-28}$ because $M_1$ and $M'_1$ are distinct.

Case 4: Equal length messages, last ramp same. If $M$ and $M'$ are equal
length and longer than $32\ell$ bits, and $M_1 \ne M'_1$ but $M_2 = M'_2$, then there
are two opportunities for a collision to take place. First, if PolyQ32($k_1$, $M_1$) =
PolyQ32($k_1$, $M'_1$), then the strings $h_1 \| M_2$ and $h'_1 \| M'_2$ are equal, guaranteeing
that $h_2$ and $h'_2$ collide. The probability of this event is no more than $\ell 2^{-28}$.
Second, if $h_1 \ne h'_1$, then a collision can still occur if PolyQ64($k_2$, $h_1 \| M_2$) =
PolyQ64($k_2$, $h'_1 \| M'_2$). One might think that this is an event with up to $m 2^{-49}$
probability, but it is not. Because $M_2 = M'_2$, the strings $h_1 \| M_2$ and $h'_1 \| M'_2$
only differ in their first 64-bit word. The collision event when hashing such strings
takes the form $(h_1 - h'_1) k_2^{j} = 0$ for some positive power $j$, which can only be
satisfied if $k_2 = 0$, a $2^{-50}$ probability event. Thus, the total probability of
collision in this case is bounded by $\ell 2^{-28} + 2^{-50}$. □

5.1 Security Notes 

If a lower collision probability is desired, one can hash messages multiple times,
using a different key for each message hash. A hash function which has an $\epsilon$
collision bound when hashing once with a random key has an $\epsilon^2$ collision bound
when hashing twice with two random keys, an $\epsilon^3$ collision bound when
hashing with three keys, etc.
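
The squaring of the bound can be observed by exhaustive counting over a toy field.
The Python sketch below is ours: the prime 251, the function toy_hash and the two
sample messages are illustrative stand-ins for the 32-bit parameters, small enough
that every key pair can be enumerated.

```python
P = 251   # toy prime standing in for p(32)


def toy_hash(k: int, msg: list) -> int:
    """Polynomial hash with implicit leading 1 over the toy field Z_251."""
    y = 1
    for m in msg:
        y = (k * y + m) % P
    return y


msg_a = [3, 1, 4, 1, 5]
msg_b = [2, 7, 1, 8, 2]

# Keys under which a single hash collides.
bad_keys = [k for k in range(P) if toy_hash(k, msg_a) == toy_hash(k, msg_b)]

# Key pairs under which both independent hashes collide at once.
bad_pairs = sum(
    1
    for k1 in range(P)
    for k2 in range(P)
    if toy_hash(k1, msg_a) == toy_hash(k1, msg_b)
    and toy_hash(k2, msg_a) == toy_hash(k2, msg_b)
)
```

If c keys out of 251 cause a single-hash collision, exactly c * c of the 251 * 251 key
pairs cause a double collision, so the collision probability is squared.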

Also, all of the theorems in this work have been stated in terms of
collisions (i.e. the difference between the result of evaluating the hash of two distinct
messages is zero). It is a simple matter to tweak the algorithms and proofs to
show that the probability that the difference between the hashes of two distinct
messages is a particular constant is bounded by the same $\epsilon$. This version of
universal hashing ("delta"-universal) is required in some message authentication
schemes.



A Fully Parameterized: PolyQ, PolyR

The body of this paper developed a two-stage ramped polynomial hash function
PolyR32_64 using polynomials over 32- and 64-bit prime fields. The concrete
choices made for PolyR32_64 were designed especially for a message
authentication code. In the MAC, we needed a universal hash function which would
guarantee a collision bound of at most $2^{-16}$ and would typically be applied to
messages no longer than a few dozen bytes. But the hash function must also
be able to process huge inputs and still guarantee a bound of at most
$2^{-16}$. These requirements led to the development of ramped polynomial hashing
in general, and to the choice of the 32- and 64-bit prime fields, and associated
crossover points, used in the body of this paper.

Other collision bounds and message lengths not addressed by PolyR32_64
are likely, and so we present in this appendix fully parameterized versions of
the hashes called PolyQ and PolyR. For each of the algorithms we state their
collision bounds as a proposition, but give no proofs. The proofs are straightforward
extensions of those given in the body of the paper.



Proposition 4. Let $v$ be any positive integer, let $K \subseteq Z_{p(v)}$ be any subset of
points in the field and let $2^{v-1} < d < p(v)$. For any positive integers
$\ell \le n$ and distinct messages $M \in \{0,1\}^{v\ell}$ and $M' \in \{0,1\}^{vn}$,
$\Pr[k \leftarrow K : \mathrm{PolyQ}[K, v, d](k, M) = \mathrm{PolyQ}[K, v, d](k, M')] \le 2n/|K|$.

Proposition 5. Let all of the parameters from Figure 6 he fixed. For any dis- 
tinct messages M and M', each shorter than bits, Pr[k K : 

PolyR(k, M) = PolyR(k, M')] < 



max 

l<z<r 




1 

m ■ 



r >= 1                  : Length of v, ℓ, K vectors used in PolyR.
v = (v_1, ..., v_r)     : Word-lengths used in PolyR, with 1 <= v_1 < ... < v_r.
ℓ = (ℓ_1, ..., ℓ_r)     : Message lengths used in PolyR, with ℓ_i >= 1 for 1 <= i <= r.
d = (d_1, ..., d_r)     : Domain bounds used in PolyR, with 2^(v_i - 1) < d_i < p(v_i).
K = (K_1, ..., K_r)     : Key-sets used in PolyR, with K_i a subset of Z_p(v_i) for 1 <= i <= r.

Fig. 6. Parameters used in the fully parameterized PolyR algorithm. Fixing these
parameters fixes the algorithm definition specified in Figure 8.



algorithm PolyQ[K, v, d](k, M)
// Parameters: "Key set" K a subset of Z_p(v), "word length" v >= 1, "domain-bound" d.
// Input: k in K and M in ({0,1}^v)^+.
// Output: y in Z_p(v).
offset <- 2^v - p(v)                      // For translating out-of-range words
marker <- p(v) - 1                        // For marking out-of-range words
n <- |M|/v
M_1 || ... || M_n <- M,                   // Break M into word size chunks
    where |M_1| = ... = |M_n| = v
y <- 1                                    // Set highest coefficient to 1
for i <- 1 to n do
    m <- str2num(M_i)
    if (m >= d) then                      // If word is not in range, then
        y <- ky + marker mod p(v)         // Marker indicates out-of-range
        y <- ky + (m - offset) mod p(v)   // Offset m back into range
    else
        y <- ky + m mod p(v)              // Otherwise hash in-range word
return y

Fig. 7. The PolyQ algorithm, parameterized on keyset K, word-length v and
domain-bound d.



algorithm PolyR(k, M)
// Parameters: Uses "vector length" r, "word-length vector" v,
//     "message-length vector" ℓ, "domain-bounds vector" d, "keyset vector" K.
// Input: k = (k_1, ..., k_r) with k_i in K_i for 1 <= i <= r and M in {0,1}^*.
// Output: Y in {0,1}^(v_r).
prepend <- ε                                        // Initially no string to prepend
i <- 1                                              // Index for v, ℓ, K vectors
while (|M| > ℓ_i v_i) do                            // While multiple ramp-levels remain
    if (i = r) then return Error                    // Message too long
    T <- M[1 ... ℓ_i v_i]                           // Extract string to be hashed under p(v_i)
    M <- M[ℓ_i v_i + 1 ... |M|]
    y <- PolyQ[K_i, v_i, d_i](k_i, prepend || T)    // Hash in Z_p(v_i), prepend previous
    prepend <- num2str(y, v_(i+1))                  // Update prepend for next ramp-level
    i <- i + 1
M <- padonezero(M, v_i)                             // Final ramp needs bijective padding
y <- PolyQ[K_i, v_i, d_i](k_i, prepend || M)        // Hash in Z_p(v_i), prepend previous
Y <- num2str(y, v_r)                                // Convert to string
return Y


Fig. 8. The PolyR algorithm. Parameters are described in Figure 6.



Abstract. Elliptic curve cryptosystems ([19,25]) are based on the elliptic
curve discrete logarithm problem (ECDLP). If elliptic curve cryptosystems
avoid FR-reduction ([11,17]) and anomalous elliptic curves over $F_q$
([34,3,36]), then with current knowledge we can construct elliptic curve
cryptosystems over a smaller definition field. ECDLP has an interesting
property that the security deeply depends on elliptic curve traces
rather than definition fields, which does not occur in the case of the
discrete logarithm problem (DLP). Therefore it is important to characterize
elliptic curve traces explicitly from the security point of view. As for
FR-reduction, supersingular elliptic curves or elliptic curves $E/F_q$ with
trace 2 have been reported to be vulnerable. Unfortunately, these
have been the only results that characterize elliptic curve traces explicitly for
FR- or MOV-reductions. More importantly, the secure trace against
FR-reduction has not been reported at all. An elliptic curve with a secure
trace is one whose reduced extension degree is always higher than a
certain level.

In this paper, we aim at characterizing elliptic curve traces by
FR-reduction and investigate explicit conditions of traces vulnerable or
secure against FR-reduction. We show new explicit conditions of elliptic
curve traces for FR-reduction. We also present algorithms to construct
such elliptic curves, which have relation to famous number theory
problems.

Keywords: elliptic curve cryptosystems, trace, FR-reduction,
number theory



1 Introduction 

Koblitz and Miller independently proposed a public key cryptosystem based
on an elliptic curve $E$ defined over a finite field $F_q$ ($q = p^r$) ([19,25]). If
elliptic curve cryptosystems satisfy the so-called FR-conditions ([24,11,17]) and avoid
anomalous elliptic curves over $F_q$ ([34,3,36]), then the only known attacks are
the Pollard $\rho$-method ([27]) and the Pohlig-Hellman method ([26]). Hence with
current knowledge, we can construct elliptic curve cryptosystems over a smaller
definition field than the discrete logarithm problem (DLP)-based cryptosystems
like the ElGamal cryptosystems ([13]) or the DSA ([12]) and RSA
cryptosystems ([28]). Elliptic curve cryptosystems with a 160-bit key are thus believed to
have the same security as both the ElGamal cryptosystems and RSA
cryptosystems with a 1,024-bit key. Recently some research comparing MOV- and
FR-reductions has been reported in [15,18]. These attacks embed a subgroup
$\langle G \rangle \subset E(F_q)$ into $F_{q^k}^*$ for an extension field $F_{q^k}$ and reduce ECDLP based on
$\langle G \rangle \subset E(F_q)$ to DLP based on a subgroup of $F_{q^k}^*$, where $G \in E(F_q)$ is called
a basepoint for ECDLP. MOV-reduction reduces ECDLP to DLP by using the
Weil pairing ([35]). Supersingular elliptic curves ([35]) have been reported to be
vulnerable against MOV-reduction, and they can be easily recognized by the trace $t$
of the $q$-th power Frobenius endomorphism, $t = q + 1 - \#E(F_q)$: an elliptic curve
is supersingular if and only if $t \equiv 0 \pmod{p}$. On the other hand, FR-reduction
reduces ECDLP to DLP by using the Tate pairing. FR-reduction can attack
elliptic curves with trace 2 in addition to supersingular elliptic curves. In fact,
these have been the only results that characterize elliptic curve traces explicitly from
the point of view of FR- and MOV-reductions. It is interesting that in the case
of $E/F_p$ over a prime field, dangerous elliptic curve traces happen to be equal
to 0 (supersingular), 1 (anomalous) and 2, which can be easily distinguished from
other elliptic curves. Thus ECDLP has an interesting property that the security
deeply depends on elliptic curve traces rather than definition fields, which does
not occur in the case of DLP. Therefore it is important to characterize elliptic
curve traces from the security point of view.

Balasubramanian and Koblitz show that the extension degrees required to apply both
reductions to ECDLP on $G \in E(\mathbb{F}_q)$ with order $n$ are the same if
$n \nmid q - 1$ ([4]). Therefore, without loss of generality, we deal only with
FR-reduction. By FR-reduction, ECDLP on $G \in E(\mathbb{F}_q)$ with order $n$ is reduced
to DLP on $\mathbb{F}_{q^k}^*$ if and only if $n \mid q^k - 1$. The probability that an
elliptic curve is vulnerable to FR-reduction, i.e. that the extension degree $k$ is
small, is shown to be very low ([4]): FR-reduction is considered not to be a threat in
practice. Nevertheless, apart from supersingular and trace 2 elliptic curves, which are
known to be weak, no class of elliptic curves has been proved to be secure in the sense
of being strong against FR-reduction. There might exist other elliptic curve traces that
are reduced to an extension field of degree at most 6, which is seriously low, and such
traces might not be as simple as 0 or 2. In fact, supersingular elliptic curves have
rather special properties compared with ordinary elliptic curves ([35]), which is
thought to cause this weakness. However, even among ordinary elliptic curves, which are
not special in this sense, there might exist elliptic curve traces with such a weakness.

More importantly, no trace that is secure against FR-reduction has been reported
yet. An elliptic curve with a secure trace is one whose reduced extension degree is
always higher than a certain level. The security of ECDLP over $E/\mathbb{F}_q$ is then
guaranteed by the security of the widely studied DLP on $\mathbb{F}_{q^k}^*$ with $k$ above
that level, since FR-reduction gives an isomorphism between ECDLP over $E/\mathbb{F}_q$
and DLP based on a subgroup of $\mathbb{F}_{q^k}^*$ ([20]). From another viewpoint, a secure
trace against FR-reduction is useful for constructing elliptic curve cryptosystems.
Consider the following requirements: a domain parameter such as an elliptic curve or a
basepoint should be chosen independently by each entity or by each application in order
to keep security high ([1]), and such an initialization should be easy to perform even
with low CPU power or small memory, as on a smart card. Under these requirements, it is
certainly desirable that an elliptic curve can be constructed at least as easily as a
prime number can be generated, which is the dominant step of RSA key generation ([28]).
This is why explicit conditions for secure elliptic curve traces are useful: we can
easily construct an elliptic curve with a given specific trace. The SEA algorithm
([30,32,7,10]) is not suitable for this purpose since it requires rather large memory.

In this paper, we aim at characterizing elliptic curve traces by FR-reduction 
and investigate explicit conditions of traces vulnerable or secure against FR- 
reduction. Here we summarize our results on new explicit conditions of elliptic 
curve traces against FR-reduction. 

• Let $E/\mathbb{F}_q$ be an elliptic curve with prime order and trace $t$.
o Theorem 2: ECDLP on $E/\mathbb{F}_q$ is reduced to DLP on $\mathbb{F}_{q^3}^*$ by FR-reduction if and only if
(i) $(q,t)$ can be represented by $q = 12l^2 - 1$ and $t = -1 \pm 6l$ ($l \in \mathbb{Z}$), or
(ii) $(q,t)$ can be represented by $q = p^r$ ($r$ is even) and $t = \pm\sqrt{q}$ (i.e. supersingular
elliptic curves).

o Theorem 3: ECDLP on $E/\mathbb{F}_q$ is reduced to DLP on $\mathbb{F}_{q^4}^*$ by FR-reduction if and only if
(i) $(q,t)$ can be represented by $q = l^2 + l + 1$ and $t = -l,\ l + 1$ ($l \in \mathbb{Z}$), or
(ii) $(q,t)$ can be represented by $q = 2^r$ ($r$ is odd) and $t = \pm\sqrt{2q}$ (i.e. supersingular
elliptic curves).

o Theorem 4: ECDLP on $E/\mathbb{F}_q$ is reduced to DLP on $\mathbb{F}_{q^6}^*$ by FR-reduction if and only if
(i) $(q,t)$ can be represented by $q = 4l^2 + 1$ and $t = 1 \pm 2l$ ($l \in \mathbb{Z}$), or
(ii) $(q,t)$ can be represented by $q = 3^r$ ($r$ is odd) and $t = \pm\sqrt{3q}$ (i.e. supersingular
elliptic curves).

Until now, it has not been reported whether there exists any elliptic curve trace,
other than the supersingular traces and trace 2, that is reduced to an extension field
of degree at most 6. Our explicit conditions mean that a prime-order elliptic curve is
reduced to an extension field of degree 3, 4 or 6 if and only if it satisfies at least
one of the conditions of Theorems 2, 3 and 4.



• Let ECDLP on $E(\mathbb{F}_q)$ with trace $t$ be reduced to DLP on $\mathbb{F}_{q^k}^*$.
o Theorem 5: If $t \ge 3$, then the extension degree $k$ satisfies

$$k \ge \frac{\log q}{\log (t - 1)} - \varepsilon,$$

where $\varepsilon$ is a real number such that $\frac{1}{10} > \varepsilon > 0$.
o Corollary 4: Let $t = 3$. Then the extension degree $k$ satisfies

$$k \ge \log q - \varepsilon.$$

These are the first explicit elliptic-curve-trace conditions under which the reduced
extension degree is always higher than a certain level. In the case of $E/\mathbb{F}_p$,
the dangerous elliptic curve traces happen to be 0, 1 and 2. In contrast,
our result shows that $E/\mathbb{F}_p$ with trace 3 is secure against FR-reduction.

Furthermore, we present an algorithm to construct elliptic curves with the 
above conditions and present some examples. 

This paper is organized as follows. Section 2 summarizes MOV- and FR-reductions.
Section 3 investigates new explicit conditions that are vulnerable or secure
against FR-reduction by proving Theorems 2, 3, 4, and 5. Section 4 presents
algorithms to construct elliptic curves satisfying the new explicit conditions.
Section 5 presents some examples.



2 MOV-Reduction and FR-Reduction

In this section, we summarize MOV- and FR-reductions against ECDLP on
$G \in E(\mathbb{F}_q)$ with order $n$. Here the $n$-torsion subgroup is denoted by
$E[n] = \{P \in E \mid nP = O\}$. We compare MOV-reduction with FR-reduction. In
MOV-reduction, ECDLP on $G$ is reduced to DLP for the smallest integer $k$ such that
$E[n] \subset E(\mathbb{F}_{q^k})$. Thus ECDLP on supersingular elliptic curves can be
efficiently reduced to DLP on $\mathbb{F}_{q^k}^*$ for $k \le 6$. On the other hand, in
FR-reduction ECDLP on $G$ is reduced to DLP for the smallest integer $k$ such that
$n \mid q^k - 1$. If $E[n] \subset E(\mathbb{F}_{q^k})$, then $n \mid q^k - 1$ ([31]).
Therefore an elliptic curve that is vulnerable to MOV-reduction is also vulnerable to
FR-reduction. In fact, FR-reduction also works efficiently for elliptic curves with
trace 2, in addition to supersingular elliptic curves.
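The extension degree used by FR-reduction is simply the multiplicative order of $q$ modulo $n$. The following minimal Python sketch illustrates this on toy parameters; the function name `embedding_degree` and the sample values are illustrative assumptions, and the sketch does not check that a curve with the given order actually exists.

```python
from math import gcd


def embedding_degree(q: int, n: int):
    """Return the smallest k >= 1 with n | q**k - 1, or None if gcd(q, n) != 1."""
    if n < 2 or gcd(q, n) != 1:
        return None
    k, power = 1, q % n
    while power != 1:
        power = (power * q) % n
        k += 1
    return k


# Toy pairs (q, t) with n = q + 1 - t (hypothetical values, chosen for illustration).
for q, t in [(7, 0), (11, 2), (13, 3)]:
    print(q, t, embedding_degree(q, q + 1 - t))
```

For these toy values the trace 0 and trace 2 cases give degrees 2 and 1, while the trace 3 case gives degree 10, in line with the qualitative picture described above.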



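Table 1. Known explicit conditions for FR-reduction

$q = p^r$                                     $t$ = trace($E$)    extension degree $k$
--------------------------------------------  ------------------  --------------------
$p \not\equiv 1 \pmod 4$ if $r$ is even       $0$                 2
$p \not\equiv 1 \pmod 3$ if $r$ is even       $\pm\sqrt{q}$       3
$p = 2$ and $r$ is odd                        $\pm\sqrt{2q}$      4
$p = 3$ and $r$ is odd                        $\pm\sqrt{3q}$      6
$r$ is even                                   $\pm 2\sqrt{q}$     1
$\forall q$                                   $2$                 1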



Balasubramanian and Koblitz ([4]) show that if $n$ is a prime and $n \nmid q - 1$,
then

$$E[n] \subset E(\mathbb{F}_{q^k}) \iff n \mid q^k - 1.$$

As a result, there is no difference between MOV-reduction and FR-reduction
except for elliptic curves with trace 2. Without loss of generality, we deal
only with FR-reduction in this paper.

Table 1 summarizes the known explicit conditions on elliptic curve traces for
FR-reduction, where the extension degree $k$ means that ECDLP on $E(\mathbb{F}_q)$ is
reduced to DLP on a subgroup of $\mathbb{F}_{q^k}^*$.

As for the probability that ECDLP is reduced to an extension field of low degree
by FR-reduction, Balasubramanian and Koblitz show the following theorem.



Theorem 1 ([4]). Let $(p,E)$ be a randomly chosen pair of a prime $p$ in the
interval $M/2 \le p \le M$ and an elliptic curve $E/\mathbb{F}_p$ with prime order $n$. The
probability $\Pr$ of $n \mid p^k - 1$ for some $k \le (\log p)^2$ satisfies

$$\Pr < C\,\frac{(\log M)^9 (\log\log M)^2}{M}$$

for $C > 0$.



Theorem 1 says that FR-reduction is highly unlikely to be an efficient attack against
ECDLP. However, we note that Theorem 1 does not say whether there exists another
explicit criterion on the elliptic curve trace under which a curve is vulnerable or
secure against FR-reduction. From Table 1, we see that no explicit condition that
guarantees an extension degree higher than a certain level has been reported.



3 New Explicit Conditions for Elliptic Curve Traces 

In this section, we investigate new explicit conditions of elliptic curve traces for 
FR-reduction. Table 2 shows our results, which will be discussed in the following 
sections. 



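Table 2. New explicit conditions for FR-reduction

$q$               $t$ = trace($E$)    extension degree $k$
----------------  ------------------  -------------------------------------------------
$12l^2 - 1$       $-1 \pm 6l$         3
$l^2 + l + 1$     $-l,\ l + 1$        4
$4l^2 + 1$        $1 \pm 2l$          6
$\forall q$       $t \ge 3$           $k \ge \frac{\log q}{\log(t-1)} - \varepsilon$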


3.1 New Explicit Conditions Vulnerable against FR-Reduction 

In this section, we investigate new conditions under which ECDLP on $E/\mathbb{F}_q$ is
reduced to DLP on an extension field of seriously low degree such as
$\mathbb{F}_{q^3}^*$, $\mathbb{F}_{q^4}^*$, and $\mathbb{F}_{q^6}^*$, which so far has been
known to occur only for supersingular elliptic curves. Supersingular elliptic curves
have rather special properties compared with ordinary elliptic curves ([35]), which
no doubt cause this vulnerability. Here we show that vulnerable trace conditions
also exist for ordinary elliptic curves.

Let $E/\mathbb{F}_q$ be an elliptic curve with order $n = \#E(\mathbb{F}_q) = q + 1 - t$, where
$t$ is the trace of $E$. We first give the conditions under which ECDLP on
$E/\mathbb{F}_q$ is reduced to DLP on $\mathbb{F}_{q^3}^*$ by FR-reduction.

Theorem 2. Let $E/\mathbb{F}_q$ be an elliptic curve with prime order $n$ ($q > 64$). ECDLP
on $E/\mathbb{F}_q$ is reduced to DLP on $\mathbb{F}_{q^3}^*$ by FR-reduction if and only if
one of the following conditions holds:
(i) $(q,t)$ can be represented by $q = 12l^2 - 1$ and $t = -1 \pm 6l$ ($l \in \mathbb{Z}$).

(ii) $(q,t)$ can be represented by $q = p^r$ ($r$ is even) and $t = \pm\sqrt{q}$ (i.e.
supersingular elliptic curves).

Proof: We assume that ECDLP on $E/\mathbb{F}_q$ with prime order $n$ is reduced to DLP
on $\mathbb{F}_{q^3}^*$ by FR-reduction. From the condition of FR-reduction, $n$ satisfies
$n \mid q^3 - 1$ and $n \nmid q - 1$ since $n$ is a prime. Therefore there is an integer
$\lambda$ such that $q^2 + q + 1 = \lambda n$. By setting $n = q + 1 - t$ and
$q^2 + q + 1 = (q + 1)^2 - t^2 + t^2 - q$, we get the following equation,

$$(q + 1 - t)(q + 1 + t - \lambda) = q - t^2. \qquad (1)$$

By Hasse's Theorem, the trace $t$ satisfies $|t| \le 2\sqrt{q}$. Hence, dividing (1) by $q$, we get

$$-3 \le \left(1 + \frac{1 - t}{q}\right)(q + 1 + t - \lambda) \le 1. \qquad (2)$$

From the assumption that $q, t \in \mathbb{Z}$ and $q > 64$, we conclude that $(q,t)$
satisfies one of the following equations,

$$q + 1 + t - \lambda = -3, -2, -1, 0, 1. \qquad (3)$$

By substituting (3) into (1), we get that $(q,t)$ satisfies one of the following equations,

$$t^2 + 3t - 4q - 3 = 0, \qquad (4)$$
$$t^2 + 2t - 3q - 2 = 0, \qquad (5)$$
$$t^2 + t - 2q - 1 = 0, \qquad (6)$$
$$t^2 - q = 0, \qquad (7)$$
$$t^2 - t + 1 = 0. \qquad (8)$$

By a simple discussion on the existence of integer solutions of congruence
equations, we get that $(t,q) \in \mathbb{Z} \times \mathbb{Z}$ exists if and only if $(t,q)$
satisfies (5) or (7). For example, in (4) and (6) the products $t(t+3)$ and $t(t+1)$
are even, so the left-hand sides are odd and cannot vanish.

In the case of (5), $(t,q)$ is expressed by $t = -1 \pm 6l$ and $q = 12l^2 - 1$ for
$l \in \mathbb{Z}$, since $q = p^r$ for a prime $p$ and $t \in \mathbb{Z}$ satisfies

$$t = -1 \pm \sqrt{3(q + 1)}.$$

In the case of (7), $(t,q)$ is expressed by $t = \pm\sqrt{q} = \pm p^{r/2}$ for even integers $r$.
This is just a supersingular elliptic curve.

Conversely, if a prime-order elliptic curve $E/\mathbb{F}_q$ satisfies (i) or (ii) in
Theorem 2, then $\#E(\mathbb{F}_q) = n$ satisfies $n \mid q^3 - 1$. Therefore ECDLP on
$E/\mathbb{F}_q$ is reduced to DLP on $\mathbb{F}_{q^3}^*$. ∎

Note that the possible orders of elliptic curves are given by Deuring ([9]) and
Waterhouse ([17]). In the case of $E/\mathbb{F}_p$, an elliptic curve of type (i) in
Theorem 2 does exist. In the case of $\mathbb{F}_{2^r}$, there does not exist any elliptic
curve of type (i) in Theorem 2, but in the case of $\mathbb{F}_{p^r}$ ($p > 3$) such a curve exists.

We easily get the next corollary from Theorem 2.

Corollary 1 Let $E/\mathbb{F}_q$ be an elliptic curve with trace $t$. If $(q,t)$ can be
represented by $q = 12l^2 - 1$ and $t = -1 \pm 6l$ ($l \in \mathbb{Z}$), then ECDLP on
$E(\mathbb{F}_q)$ is reduced to DLP on $\mathbb{F}_{q^3}^*$ by FR-reduction.
Proof: Here we set $\#E(\mathbb{F}_q) = n$ and let the order of $G \in E(\mathbb{F}_q)$ be $m$.
Then $m$ divides $n$. From the assumption, $n = q + 1 - t = 12l^2 \mp 6l + 1$. This
yields $12l^2 \equiv \pm 6l - 1 \pmod n$. Then, by using both
$12l^2 \equiv \pm 6l - 1 \pmod n$ and $q = 12l^2 - 1$, we get

$$q^3 - 1 = (12l^2 - 2)\big((12l^2 - 1)^2 + 12l^2\big)$$
$$\equiv (12l^2 - 2)\big((\pm 6l - 2)^2 + (\pm 6l - 1)\big) \pmod n$$
$$= (12l^2 - 2)(36l^2 \mp 18l + 3)$$
$$\equiv 0 \pmod n$$
$$\equiv 0 \pmod m.$$

Therefore ECDLP on any $\langle G \rangle \subset E(\mathbb{F}_q)$ is reduced to DLP on
$\mathbb{F}_{q^3}^*$ by FR-reduction. ∎
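The divisibility used in this proof can be checked numerically. The sketch below is a purely arithmetic check under the parametrization $q = 12l^2 - 1$, $t = -1 \pm 6l$; the helper name `corollary1_orders` is an illustrative assumption, and the code does not test whether $q$ is a prime power or whether a curve with that order exists.

```python
def corollary1_orders(l: int):
    """Return q = 12 l^2 - 1 and the orders n = q + 1 - t for t = -1 + 6l and t = -1 - 6l."""
    q = 12 * l * l - 1
    orders = [q + 1 - t for t in (-1 + 6 * l, -1 - 6 * l)]
    return q, orders


for l in range(1, 6):
    q, (n_plus, n_minus) = corollary1_orders(l)
    # Both candidate orders divide q^3 - 1, and their product equals q^2 + q + 1.
    assert (q**3 - 1) % n_plus == 0 and (q**3 - 1) % n_minus == 0
    assert n_plus * n_minus == q * q + q + 1
    print(l, q, n_plus, n_minus)
```

For instance, $l = 2$ gives $q = 47$ with candidate orders 37 and 61, and $37 \cdot 61 = 2257 = 47^2 + 47 + 1$.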

Next we give the conditions under which ECDLP on $E/\mathbb{F}_q$ is reduced to DLP
on $\mathbb{F}_{q^4}^*$ by FR-reduction.

Theorem 3. Let $E/\mathbb{F}_q$ be an elliptic curve with prime order $n$ ($q > 36$). ECDLP
on $E/\mathbb{F}_q$ is reduced to DLP on $\mathbb{F}_{q^4}^*$ by FR-reduction if and only if
one of the following conditions holds:

(i) $(q,t)$ can be represented by $q = l^2 + l + 1$ and $t = -l,\ l + 1$ for $l \in \mathbb{Z}$.

(ii) $(q,t)$ can be represented by $q = 2^r$ ($r$ is odd) and $t = \pm\sqrt{2q}$ (i.e.
supersingular elliptic curves).

Proof: We assume that ECDLP on $E/\mathbb{F}_q$ with prime order $n$ is reduced to DLP
on $\mathbb{F}_{q^4}^*$ by FR-reduction. From the condition of FR-reduction, $n$ satisfies
$n \mid q^4 - 1$ and $n \nmid q^2 - 1$ since $n$ is a prime. Therefore there is an integer
$\lambda$ such that $q^2 + 1 = \lambda n$. In the same way as in Theorem 2, we get the
following equation,

$$(q + 1 - t)(q + 1 + t - \lambda) = 2q - t^2. \qquad (9)$$

From Hasse's Theorem, (9) satisfies

$$-2 \le \left(1 + \frac{1 - t}{q}\right)(q + 1 + t - \lambda) \le 2. \qquad (10)$$

By the same discussion as in Theorem 2, we get that $(t,q) \in \mathbb{Z} \times \mathbb{Z}$
exists if and only if $(t,q)$ satisfies

$$t^2 - 2q = 0, \qquad (11)$$
$$t^2 - t - q + 1 = 0. \qquad (12)$$

In the case of (11), $t$ satisfies $t = \pm\sqrt{2q} = \pm\sqrt{2p^r}$ for $p = 2$ and an
odd positive integer $r$. This is just a supersingular elliptic curve. In the case of
(12), $(t,q)$ is expressed by $t = -l,\ l + 1$ and $q = l^2 + l + 1$ for $l \in \mathbb{Z}$,
since $t \in \mathbb{Z}$ satisfies

$$t = \frac{1 \pm \sqrt{4q - 3}}{2}.$$

Clearly, if a prime-order elliptic curve $E/\mathbb{F}_q$ satisfies (i) or (ii) in
Theorem 3, then ECDLP on $E/\mathbb{F}_q$ is reduced to DLP on $\mathbb{F}_{q^4}^*$. ∎

The next corollary follows from Theorem 3.

Corollary 2 Let $E/\mathbb{F}_q$ be an elliptic curve with trace $t$. If $(q,t)$ can be
represented by $q = l^2 + l + 1$ and $t = -l,\ l + 1$ for $l \in \mathbb{Z}$, then ECDLP on
$E(\mathbb{F}_q)$ is reduced to DLP on $\mathbb{F}_{q^4}^*$ by FR-reduction.

In the same way as in Theorems 2 and 3, the explicit conditions under which
ECDLP on $E/\mathbb{F}_q$ is reduced to DLP on $\mathbb{F}_{q^6}^*$ by FR-reduction are given as
follows.

Theorem 4. Let $E/\mathbb{F}_q$ be an elliptic curve with prime order $n$. ECDLP on
$E/\mathbb{F}_q$ is reduced to DLP on $\mathbb{F}_{q^6}^*$ by FR-reduction if and only if one of
the following conditions holds:

(i) $(q,t)$ can be represented by $q = 4l^2 + 1$ and $t = 1 \pm 2l$ for $l \in \mathbb{Z}$.

(ii) $(q,t)$ can be represented by $q = 3^r$ and $t = \pm\sqrt{3q}$ for an odd integer $r$
(i.e. supersingular elliptic curves).

Corollary 3 Let $E/\mathbb{F}_q$ be an elliptic curve with trace $t$. If $(q,t)$ can be
represented by $q = 4l^2 + 1$ and $t = 1 \pm 2l$ for $l \in \mathbb{Z}$, then ECDLP on
$E/\mathbb{F}_q$ is reduced to DLP on $\mathbb{F}_{q^6}^*$ by FR-reduction.



Remark 1 Theorems 2, 3, and 4 use the fact that the $k$-th cyclotomic polynomial
is decomposed into irreducible polynomials of degree at most 2 over $\mathbb{Z}$ in the
cases $k = 3$, $4$, and $6$, respectively. For other values of $k$, the same discussion
might be used if the $k$-th cyclotomic polynomial is decomposed into irreducible
polynomials with rather small degrees over $\mathbb{Z}$.
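As a worked illustration of why the three parametrizations give extension degrees 3, 4 and 6, the value of the corresponding cyclotomic polynomial at $q$ splits into the two candidate group orders $n = q + 1 - t$. The identities below were obtained by direct expansion and are offered as a check, not as part of the original proofs.

```latex
\begin{align*}
k = 3,\ q = 12l^2 - 1:&\quad q^2 + q + 1 = (12l^2 + 6l + 1)(12l^2 - 6l + 1),
   && t = -1 \mp 6l,\\
k = 4,\ q = l^2 + l + 1:&\quad q^2 + 1 = (l^2 + 1)(l^2 + 2l + 2),
   && t = l + 1,\ -l,\\
k = 6,\ q = 4l^2 + 1:&\quad q^2 - q + 1 = (4l^2 + 2l + 1)(4l^2 - 2l + 1),
   && t = 1 \mp 2l.
\end{align*}
```

In each row the first factor is $q + 1 - t$ for the first listed trace and the second factor is $q + 1 - t$ for the second, so $n$ divides $q^k - 1$ in every case.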



3.2 New Explicit Conditions Secure against FR-Reduction 

In this section, from the security point of view, we investigate a new explicit
condition on elliptic curve traces under which the reduced extension degree is always
higher than a certain level. Among the known results on $E/\mathbb{F}_p$, the dangerous
elliptic curves happen to have small traces, namely 0, 1 and 2. In contrast, our
results in Theorems 2, 3 and 4 suggest that elliptic curve traces whose order of
magnitude is near the upper bound in Hasse's Theorem ([35]) tend to be vulnerable.
Accordingly, we show that the extension degree is higher than a certain level when the
trace is positive, different from 0, 1 and 2, and small enough.

Theorem 5. Let $E/\mathbb{F}_q$ be an elliptic curve with prime order $n$ ($q > 861$),
let ECDLP on $E(\mathbb{F}_q)$ be reduced to DLP on $\mathbb{F}_{q^k}^*$, and let $t$ be the
elliptic curve trace. If $t \ge 3$, then the extension degree $k$ satisfies

$$k \ge \frac{\log q}{\log (t - 1)} - \varepsilon,$$

where $\varepsilon$ is a real number such that $\frac{1}{10} > \varepsilon > 0$.

Proof: ECDLP on $E(\mathbb{F}_q)$ is reduced to DLP on $\mathbb{F}_{q^k}^*$ if and only if

$$q^k \equiv 1 \pmod n. \qquad (13)$$

By substituting $n = q + 1 - t$ into (13), we get that $k$ is the smallest integer
satisfying

$$(t - 1)^k \equiv 1 \pmod n. \qquad (14)$$

From the assumption and Hasse's theorem, $t$ satisfies
$3 \le t \le 2\sqrt{q} \ll q \approx n$. Therefore

$$1 < (t - 1)^k < n < n + 1$$

if $1 \le k < \frac{\log n}{\log (t - 1)}$. It follows that the smallest integer $k$ such
that $(t - 1)^k \equiv 1 \pmod n$ is greater than or equal to
$\frac{\log n}{\log (t - 1)}$. Furthermore, by substituting $n = q + 1 - t$, we get

$$k \ge \frac{\log q}{\log (t - 1)} - \varepsilon,$$

where $\varepsilon = -\log_{t-1}\left(1 - \frac{t - 1}{q}\right)$. By using the relation
$3 \le t \le 2\sqrt{q}$, we easily get

$$0 < \varepsilon \le -\log_{t-1}\left(1 - \frac{2}{\sqrt{q}} + \frac{1}{q}\right) < \frac{1}{10}$$

if $q > 861$. Clearly, the larger $q$ is, the smaller $\varepsilon$ is. Thus the lower
bound on the extension degree is given by

$$k \ge \frac{\log q}{\log (t - 1)} - \varepsilon.$$ ∎

The above theorem gives a lower bound on the extension degree $k$ for small
$t \ge 3$, which ties the security of ECDLP over $E/\mathbb{F}_q$ to that of the widely
studied DLP on $\mathbb{F}_{q^k}^*$.
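The lower bound can be sanity-checked on small parameters. The following Python sketch is only a numerical experiment under stated assumptions: it restricts $q$ to small primes, uses trial division, and compares the order of $q$ modulo $n = q + 1 - t$ with $\log n / \log(t - 1)$ (the quantity from the proof, before $n$ is replaced by $q$). The function names are illustrative, and the existence of a curve with each order is not checked.

```python
from math import gcd, isqrt, log


def is_prime(m: int) -> bool:
    """Trial division; adequate for the small toy range used here."""
    if m < 2:
        return False
    return all(m % r for r in range(2, isqrt(m) + 1))


def embedding_degree(q: int, n: int):
    """Smallest k >= 1 with n | q**k - 1 (None if q is not invertible mod n)."""
    if n < 2 or gcd(q, n) != 1:
        return None
    k, power = 1, q % n
    while power != 1:
        power = (power * q) % n
        k += 1
    return k


def bound_violations(q_max: int):
    """List (q, t, k, bound) with k < log n / log(t - 1); expected to be empty."""
    bad = []
    for q in range(5, q_max):
        if not is_prime(q):
            continue
        for t in range(3, isqrt(4 * q) + 1):  # 3 <= t <= 2*sqrt(q)
            n = q + 1 - t
            if not is_prime(n):
                continue
            k = embedding_degree(q, n)
            bound = log(n) / log(t - 1)
            if k is not None and k < bound:
                bad.append((q, t, k, bound))
    return bad


print(bound_violations(500))
```

An empty list means no pair in the tested range falls below the bound; this is an experiment on toy sizes, not a substitute for the proof.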

The next corollary easily follows from Theorem 5.

Corollary 4 Let $E/\mathbb{F}_q$ be a prime-order elliptic curve with $t = 3$ ($q > 861$)
and let ECDLP on $E(\mathbb{F}_q)$ be reduced to DLP on $\mathbb{F}_{q^k}^*$. Then the extension
degree $k$ satisfies

$$k \ge \log q - \varepsilon,$$

where $\varepsilon$ is a real number such that $\frac{1}{10} > \varepsilon > 0$.



Remark 2 An extension degree $k < \log q$ means that FR-reduction gives a
subexponential attack against ECDLP under the index calculus method ([8]), which
runs over any field $\mathbb{F}_q$ in time
$L_q[1/2, c] = \exp((c + o(1))(\log q)^{1/2}(\log\log q)^{1/2})$. On the other hand, an
extension degree $k < (\log q)^2$ means that FR-reduction gives a subexponential
attack against ECDLP under the number field sieve ([14]), which runs over some
fields $\mathbb{F}_q$ in time
$L_q[1/3, c] = \exp((c + o(1))(\log q)^{1/3}(\log\log q)^{2/3})$. Therefore, in order to
construct sufficiently secure elliptic curve cryptosystems, it would be desirable that
$k > (\log q)^2$. However, the condition $k \ge \log q$ in Corollary 4 is not overly
optimistic if we estimate under a rather realistic assumption on the discrete
logarithm algorithm for definition fields of elliptic curves ([29,8]).

In the case of prime-order elliptic curves $E/\mathbb{F}_p$ with $t = 3$, we easily see that
the following stricter condition also holds: the extension degree is exactly exponential.

Corollary 5 Let $E/\mathbb{F}_p$ be a prime-order elliptic curve with $t = 3$ (i.e.
$\#E(\mathbb{F}_p) = p - 2$ is prime). If 2 is a primitive root in $\mathbb{F}_{p-2}$, then the
extension degree $k$ such that ECDLP on $E(\mathbb{F}_p)$ is reduced to DLP on
$\mathbb{F}_{p^k}^*$ satisfies $k = p - 3$.
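A small numerical illustration of Corollary 5: for $t = 3$ the extension degree is the order of $p \equiv 2$ modulo $n = p - 2$, so it equals $p - 3$ exactly when 2 is a primitive root modulo $p - 2$. The sketch below uses toy primes; the function names are illustrative assumptions, and curve existence is not checked.

```python
def multiplicative_order(a: int, n: int) -> int:
    """Order of a modulo n; assumes gcd(a, n) == 1 and n >= 2."""
    k, power = 1, a % n
    while power != 1:
        power = (power * a) % n
        k += 1
    return k


def trace3_degree(p: int) -> int:
    """Extension degree for trace 3: the order of p modulo n = p - 2."""
    n = p - 2
    return multiplicative_order(p % n, n)


for p in (7, 13, 19):  # p and p - 2 are both prime for these toy values
    print(p, trace3_degree(p), p - 3)
```

For $p = 7$ and $p = 13$ the degree equals $p - 3$ (4 and 10), whereas for $p = 19$ it is only 8, because 2 has order 8 rather than 16 modulo 17.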

4 Algorithm 

In this section, we describe algorithms to construct the elliptic curves of Section 3
that are vulnerable or secure against FR-reduction, and confirm that such elliptic
curves exist in a realistic sense (i.e. are constructible). From a theoretical point of
view, each construction is closely related to a famous problem in number theory:
the former to finding integer solutions of Pell's equation ([16]),
and the latter to finding twin prime numbers.



4.1 Construction of Elliptic Curves Reducible to Lower Extension 
Degree 

Here we present an algorithm to construct elliptic curves over $\mathbb{F}_p$ in Corollary 1
since Theorem 2 is a special case of Corollary 1. By using the CM-method ([2])^1,
the dominant step in constructing elliptic curves with both $p = 12l^2 - 1$ and
$t = -1 \pm 6l$ ($l \in \mathbb{Z}$) is finding integer solutions $(l,y)$ of
$12l^2 \pm 12l - 5 = dy^2$ for a given positive integer $d \equiv 3 \pmod 4$, which is
easily transformed, with $x = 6l \pm 3$, into finding integer solutions of the
indeterminate equation

$$x^2 - 3dy^2 = 24. \qquad (15)$$

From elementary number theory ([37]), all integer solutions $(x,y)$ of (15) are
given by

$$x + y\sqrt{3d} = (x_1 + y_1\sqrt{3d})(t_0 + u_0\sqrt{3d})^n,$$

where $(t_0,u_0)$ is the minimum positive integer solution, with
$\varepsilon = t_0 + u_0\sqrt{3d} > 0$, of Pell's equation,

$$T^2 - 3dU^2 = 1, \qquad (16)$$

and $(x_1,y_1)$ is an integer solution of (15) in the following domain Dom,

Dom = {(a:,y)|%/24 < x < 0 < a: < uoV^}- 

^1 The procedure of the CM-method includes a step of computing the Hilbert class
polynomial ([23]), $P_d(x)$. Computing the Hilbert class polynomial is not
easy if its degree, $\deg(P_d(x))$, namely the class number, is large. Therefore we
usually fix $d$, and hence $P_d(x)$, beforehand in order to avoid the computation of
$P_d(x)$, as we will see in Algorithm 2. Alternatively, we may make use of recent
research ([5,6]) on the construction of CM elliptic curves by CM tests and liftings
instead of the CM-method.

Here we say that two integer solutions $(x,y)$ and $(x',y')$ of (15) are associated if

$$x + y\sqrt{3d} = \pm(x' + y'\sqrt{3d})(t_0 + u_0\sqrt{3d})^n$$

for some $n \in \{0, \pm 1, \pm 2, \ldots\}$. After finding an integer solution $(x,y)$ of
(15) by the above procedure, the construction of an elliptic curve $E/\mathbb{F}_p$ with
trace $t$ follows easily by the CM-method. In order to find integer solutions
efficiently, we need some techniques specific to (15). Here we show only these specific
techniques, all of which are proved by a simple discussion on the existence of integer
solutions of congruence equations.

Lemma 1. If there exists an integer solution $(l,y)$ of $12l^2 \pm 12l - 5 = dy^2$, then
$d \equiv 19 \pmod{24}$.

Proof: From $dy^2 = 12l^2 \pm 12l - 5 = 12l(l \pm 1) - 5 \equiv 19 \pmod{24}$, we get
$dy^2 \equiv 19 \pmod{24}$. By using the fact that $y^2 \equiv 0, 1, 4, 9, 12, 16 \pmod{24}$,
we get that $d \equiv 19 \pmod{24}$ if there exists an integer solution of
$dy^2 \equiv 19 \pmod{24}$. ∎
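The residue argument in Lemma 1 is small enough to check exhaustively. The following Python sketch enumerates the residues modulo 24; the variable names are illustrative.

```python
# Squares modulo 24.
squares_mod_24 = sorted({(y * y) % 24 for y in range(24)})

# Residues of 12 l^2 +/- 12 l - 5 modulo 24 over all l.
lhs_residues = sorted({(12 * l * l + s * 12 * l - 5) % 24
                       for l in range(24) for s in (1, -1)})

# Residues d for which d * y^2 = 19 (mod 24) is solvable.
admissible_d = sorted({d for d in range(24)
                       for sq in squares_mod_24 if (d * sq) % 24 == 19})

print(squares_mod_24, lhs_residues, admissible_d)
```

The enumeration reproduces the list of squares used in the proof and leaves 19 as the only admissible residue for $d$.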



Lemma 2. Let $d \in \mathbb{Z}$ satisfy $d \equiv 19 \pmod{24}$. If there exists an integer
solution $(x_0,y_0)$ of (15), then $\gcd(x_0,y_0) = 1$.

Proof: Let $(x,y)$ be an integer solution of (15) and $\gcd(x,y) = g > 1$. Then $g = 2$
since $g^2 \mid 24$. So we can set $x = 2x'$ and $y = 2y'$ ($x',y' \in \mathbb{Z}$) with
$\gcd(x',y') = 1$. From the assumption $d \equiv 19 \pmod{24}$, $(x',y')$ satisfies
$x'^2 + 3y'^2 \equiv 6 \pmod{12}$. This is a contradiction because there does not exist
any integer solution $(x,y)$ of $x^2 + 3y^2 \equiv 6 \pmod{12}$. ∎

Corollary 6 Let $d \in \mathbb{Z}$ satisfy $d \equiv 19 \pmod{24}$. If there exists an integer
solution $(x_0,y_0)$ of (15), then both $x_0$ and $y_0$ are odd.

Proof: This follows from Lemma 2. ∎

Lemma 3. Let $d \in \mathbb{Z}$ satisfy $d \equiv 19 \pmod{24}$ and let $(x_0,y_0)$ be an
integer solution of (15). Then $(x_0,y_0)$ and $(x_0,-y_0)$ are not associated.

Proof: Two solutions $(x,y)$ and $(x',y')$ of (15) are associated if and only if
$xy' - x'y \equiv 0 \pmod{24}$ (see Section 34 in [37]). Therefore, if $(x_0,y_0)$ and
$(x_0,-y_0)$ are associated, then $2x_0y_0 \equiv 0 \pmod{24}$. This contradicts
Corollary 6. ∎



Lemma 4. Let $d \in \mathbb{Z}$ satisfy $d \equiv 19 \pmod{24}$. Then there are at most two
integer solutions in Dom for (15).

Proof: If there exists an integer solution $(x,y)$ in Dom for (15), then by Lemma 2
there exists an integer $s$ satisfying the following conditions (see Section 35 in [37]):

$$12d = s^2 - 96w, \quad \gcd(24, s, w) = 1, \quad s^2 \equiv 12d \pmod{96}, \quad -24 < s < 24.$$

From a simple discussion on the existence of integer solutions of congruence
equations, there are at most two integers $s$ satisfying the above conditions.
Therefore there are at most two integer solutions in Dom for (15). ∎

The next proposition follows from Lemmas 3 and 4.

Proposition 1 Let $d \in \mathbb{Z}$ satisfy $d \equiv 19 \pmod{24}$. Then, if any integer
solution in Dom exists for (15), there exist exactly two.

Here we give the algorithm as follows:

Algorithm 1 Given the upper bound $UP > 0$ on a prime $p$, this
algorithm outputs $(p,d,l)$, or fail if such a $(p,d,l)$ does not exist.

1. Choose a positive integer $d$ such that $d \equiv 19 \pmod{24}$.

2. Find the minimum positive integer solution $(t_0,u_0)$ of (16).

3. Find an integer solution $(x,y) \in$ Dom of (15), if one exists.
   Otherwise, output fail and terminate the algorithm.

4. For $n \ge 1$, set $x_n$, $y_n$ in such a way that
   $x_n + y_n\sqrt{3d} := (x + y\sqrt{3d})(t_0 + u_0\sqrt{3d})^n$.

5. Set $l_{1,n} := (x_n - 3)/6$, $l_{2,n} := (x_n + 3)/6$, $p_{1,n} := 12l_{1,n}^2 - 1$, and
   $p_{2,n} := 12l_{2,n}^2 - 1$.

6. If $p_{1,n} > UP$ and $p_{2,n} > UP$, then output fail and terminate the
   algorithm.

7. If $p_{1,n}$ or $p_{2,n}$ is prime, then output $(p_{1,n}, d, l_{1,n})$ or
   $(p_{2,n}, d, l_{2,n})$ respectively, and terminate the algorithm. Otherwise go to step 4
   with the next $n$.
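The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under several assumptions that are not from the paper: the Pell solution and the seed solution of $x^2 - 3dy^2 = 24$ are found by bounded brute force (standing in for the search over Dom), primality is tested with Miller-Rabin on fixed small bases (probabilistic in general), and all function names and limits are hypothetical.

```python
from math import isqrt

_BASES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)


def is_probable_prime(n: int) -> bool:
    """Miller-Rabin with fixed small bases (a probabilistic test in general)."""
    if n < 2:
        return False
    for b in _BASES:
        if n % b == 0:
            return n == b
    d, s = n - 1, 0
    while d % 2 == 0:
        d, s = d // 2, s + 1
    for a in _BASES:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False
    return True


def pell_fundamental(D: int, u_limit: int = 10**6):
    """Smallest (t0, u0) with t0^2 - D*u0^2 = 1, by brute force over u0 (step 2)."""
    for u in range(1, u_limit):
        t = isqrt(D * u * u + 1)
        if t * t == D * u * u + 1:
            return t, u
    return None


def seed_solution(d: int, y_limit: int = 10**4):
    """Smallest-y solution of x^2 - 3*d*y^2 = 24, by brute force (stand-in for step 3)."""
    for y in range(y_limit):
        x = isqrt(24 + 3 * d * y * y)
        if x * x == 24 + 3 * d * y * y:
            return x, y
    return None


def algorithm1(d: int, upper_bound: int):
    """Return (p, d, l) with p = 12*l^2 - 1 prime and p <= upper_bound, or None."""
    D = 3 * d
    pell, seed = pell_fundamental(D), seed_solution(d)
    if pell is None or seed is None:
        return None
    (t0, u0), (x, y) = pell, seed
    while True:
        x, y = x * t0 + D * y * u0, x * u0 + y * t0   # step 4: next (x_n, y_n)
        if x % 6 != 3:                                 # l must be an integer
            return None
        l1, l2 = (x - 3) // 6, (x + 3) // 6            # step 5
        p1, p2 = 12 * l1 * l1 - 1, 12 * l2 * l2 - 1
        if p1 > upper_bound and p2 > upper_bound:      # step 6
            return None
        for p, l in ((p1, l1), (p2, l2)):              # step 7
            if p <= upper_bound and is_probable_prime(p):
                return p, d, l


print(pell_fundamental(57), seed_solution(19), algorithm1(19, 10**12))
```

For $d = 19$ the brute-force helpers find the Pell solution $(151, 20)$ and the seed $(9, 1)$; a production implementation would replace both searches with continued-fraction methods and use a vetted big-integer primality test.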



4.2 Construction of Elliptic Curves Reducible to Higher Extension 
Degree 

Here we present an algorithm to construct elliptic curves $E/\mathbb{F}_p$ with $t = 3$ in
Corollary 4, in which the CM-method is used in the same way as in Section 4.1.
By using the CM-method, the dominant steps in constructing prime-order
elliptic curves $E/\mathbb{F}_p$ with $t = 3$, namely $\#E(\mathbb{F}_p) = p - 2$, are finding a
prime number $p = dl^2 + dl + \frac{d+9}{4}$ with $l \in \mathbb{Z}$ for a given positive
integer $d \equiv 3 \pmod 4$, and checking that $p - 2$ is also prime.

In this case we can easily show the following condition on $d$.

Lemma 5. Let $p \in \mathbb{Z}$ be $p = dl^2 + dl + \frac{d+9}{4}$ with a positive integer
$d \equiv 3 \pmod 4$. If both $p$ and $p - 2$ are prime, then $d \equiv 19 \pmod{24}$.

Proof: From the assumption $d \equiv 3 \pmod 4$, we set $d = 3 + 4m$ ($m \in \mathbb{Z}$). Then

$$p = dl^2 + dl + \frac{d + 9}{4}$$
$$= dl(l + 1) + (m + 3) \qquad (17)$$
$$\equiv m + 1 \pmod 2. \qquad (18)$$

Since $p$ is prime, $m \equiv 0 \pmod 2$ from (18). So we can set $d = 3 + 8m'$ for some
$m' \in \mathbb{Z}$. On the other hand, we get $p \equiv 1 \pmod 6$ since both $p$ and $p - 2$
are prime, and we also easily get $l(l + 1) \equiv 0, 2 \pmod 6$ for all $l \in \mathbb{Z}$.
If $l(l + 1) \equiv 0 \pmod 6$, then $m' \equiv 2 \pmod 3$ from (17). This yields
$d \equiv 19 \pmod{24}$. If $l(l + 1) \equiv 2 \pmod 6$, then we obtain a contradiction.
In this way we get $d \equiv 19 \pmod{24}$. ∎

Here we give the algorithm as follows:

Algorithm 2 Given the upper bound $UP > 0$ on a prime $p$, this
algorithm outputs the prime-order elliptic curve $E/\mathbb{F}_p$ with $t = 3$,
or fail if such an $E/\mathbb{F}_p$ does not exist.

1. Choose a positive integer $d$ such that $d \equiv 19 \pmod{24}$.

2. Set $p = dl(l + 1) + \frac{d+9}{4}$ for $l \in \mathbb{Z}_{>0}$ such that $l \equiv 0, 2 \pmod 3$.

3. If $p > UP$, then output fail and terminate the algorithm.
   Otherwise go to step 4.

4. If both $p$ and $p - 2$ are prime, then go to step 5. Otherwise go to
   step 2 and try the next $l$.

5. Compute the Hilbert class polynomial $P_d(x)$.

6. Find a root $j_0$ of $P_d(x) \equiv 0 \pmod p$.

7. Construct two elliptic curves $E_{j_0}$ and $E'_{j_0}$,
   $E_{j_0} : y^2 = x^3 + a_{j_0}x + b_{j_0}$, $E'_{j_0} : y^2 = x^3 + a_{j_0}c^2x + b_{j_0}c^3$,
   where $a_{j_0}$ and $b_{j_0}$ are computed from $j_0$ modulo $p$, and $c$ is any
   quadratic non-residue in $\mathbb{F}_p$.

8. Output $E \in \{E_{j_0}, E'_{j_0}\}$ with $\#E(\mathbb{F}_p) = p - 2$ and terminate the
   algorithm.

Note that step 8 can be performed easily: output the curve $E$ such that $(p - 2)G = O$
for some $G \in E(\mathbb{F}_p)$ with $G \ne O$.
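Steps 1 to 4 of Algorithm 2, the search for a twin-prime pair $(p, p - 2)$ with $p = dl(l+1) + \frac{d+9}{4}$, can be sketched as follows. This is a toy illustration only: it uses trial division instead of a big-integer primality test, omits the CM steps 5 to 8 entirely, and the function names are hypothetical.

```python
from math import isqrt


def is_prime(m: int) -> bool:
    """Trial division; adequate only for small toy inputs."""
    if m < 2:
        return False
    return all(m % r for r in range(2, isqrt(m) + 1))


def twin_prime_search(d: int, upper_bound: int):
    """Return (p, l) with p = d*l*(l+1) + (d+9)/4 and p, p-2 both prime, or None."""
    if d % 24 != 19:
        raise ValueError("d must satisfy d = 19 (mod 24)")
    c = (d + 9) // 4                      # an integer because d = 3 (mod 4)
    l = 1
    while True:
        if l % 3 != 1:                    # step 2: l = 0 or 2 (mod 3)
            p = d * l * (l + 1) + c
            if p > upper_bound:           # step 3
                return None
            if is_prime(p) and is_prime(p - 2):   # step 4
                return p, l
        l += 1


print(twin_prime_search(19, 10**6))
```

With $d = 19$ the first hit in this sketch is $l = 12$, giving $p = 2971$ and $p - 2 = 2969$; indeed $4p - 9 = 11875 = 19 \cdot 25^2$, which is the CM relation $4p - t^2 = dy^2$ for $t = 3$ and $y = 2l + 1$.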

5 Experimental Results 

In this section, we present some examples in both vulnerable and secure cases. 



5.1 Elliptic Curves Reducible to Lower Extension Degree 

We present one example which satisfies the condition of Corollary 1. We searched 
elliptic curves E/¥p in the range of 0 < p < 2^°°° by using Algorithm 1. Our 
modular arithmetic uses the GNU MP Library GMP ([38]). The platform is an
Alpha 21264 (500 MHz, C compiler for Digital UNIX). It took 0.101 sec on average
to find an elliptic curve $E/\mathbb{F}_p$ in the case of $d = 19$. We have also confirmed
experimentally that vulnerable elliptic curves satisfying the new explicit conditions
can be constructed systematically, in the same way as supersingular or trace 2
elliptic curves. This means that even for ordinary elliptic curves, we must
check the FR-conditions. Recently, protocols that use an elliptic curve
$E/\mathbb{F}_p$ with a computable FR-reduction have been proposed ([21,22]), in
which an elliptic curve $E/\mathbb{F}_p$ reduced to $\mathbb{F}_{p^k}^*$ with a computable low
extension degree is desired. Our approach is also closely related to this line of research.



Example 1

$E/\mathbb{F}_p : y^2 = x^3 + ax + b$

p = 9 08761 00379 04279 08077 54895 57583 80356 67582 90265 31247 (170-bit),

a = 8 18416 34259 48882 91485 04408 88116 40789 05308 57899 75506,

b = 6 66070 44332 39783 49780 03588 18034 13282 86571 48420 57992,

t = -5 22138 20118 54029 93899 01413,

$\#E(\mathbb{F}_p) = 7^2 \cdot 313 \cdot n$,

n = 59 25285 28258 73893 72612 30363 15589 78126 20544 05453 (156-bit).
5.2 Elliptic Curves Reducible to Higher Extension Degree

We present experimental results and some examples of the elliptic curves in
Corollaries 4 and 5. We have confirmed that secure elliptic curves satisfying the new
explicit conditions can be constructed systematically. Table 3 shows numerical results
for twin primes $(p, p - 2)$ with $p = dl^2 + dl + \frac{d+9}{4}$, which were searched in the range
of 2^® — 2^® < I < 2^® -I- 2^®. Our modulo arithmetic uses the GNU MP Library 
GMP ([38]). The platform is an Alpha 21264 (500 MHz, C compiler for Digital
UNIX). It took 0.053 sec on average to find a pair $(p, p - 2)$ in the case of
$d = 163$. For other values of $d$, we could find such a pair of primes in
0.064 ~ 1.402 sec on average. Fig. 1 plots Table 3 in terms of
$\deg(P_d(x))$ and the size of $d$ in $P_d(x)$. From our experimental results, we have
observed the heuristic property that the number of twin primes is closely related to
two factors, $\deg(P_d(x))$ and the size of $d$ in $P_d(x)$. If we fix the size of $d$, then
the larger $\deg(P_d(x))$ is, the fewer twin primes are found. If we fix $\deg(P_d(x))$,
then the larger $d$ is, the more twin primes are found.



Table 3. The number of twin primes $(p, p - 2)$

$d$     $\deg(P_d(x))$    # twin primes    time (sec)
------  ----------------  ---------------  ----------
19      1                 190              0.550
43      1                 1,157            0.094
67      1                 1,902            0.064
91      2                 450              0.365
115     2                 1,036            0.209
139     3                 139              0.323
163     1                 5,158            0.053
187     2                 1,402            0.107
211     3                 292              1.401
235     2                 2,523            0.089
259     4                 247              0.348
283     3                 645              0.234
307     3                 696              0.134
331     3                 1,458            0.103
355     4                 635              0.261
379     3                 1,583            0.074
403     2                 3,392            0.069



P 


= dP + dl+’^ (2'^® 


- 2^® < Z < 2'^® -h 220) 



Fig. 1. Relations between # twin primes and $P_d(x)$

To compare with RSA key generation, we measured twin-prime generation times
and RSA-prime generation times. For equal security levels, we consider the
following three pairs of bit sizes (elliptic curve cryptosystem, RSA):
(160, 1024), (224, 2048) and (256, 3072) ([33]). Table 4 shows both the twin-prime
generation times and the RSA-prime generation times, where the size of an RSA prime
is half the RSA size at each security level. For twin-prime generation, we dealt with
the four cases $d = 163, 427, 907, 1555$, which correspond to
$\deg(P_d(x)) = 1, 2, 3, 4$ respectively. These values of $\deg(P_d(x))$ are also used as
labels in Table 4 and Fig. 2. We searched for 1,000 twin primes by Algorithm 2
and computed the average time. For RSA-prime generation, we searched for 1,000 RSA
primes by simply performing a primality test on odd numbers, and computed the
average time. The platform is again an Alpha 21264 (500 MHz, C compiler for Digital
UNIX). For the primality test, we used the Miller-Rabin probabilistic test in the
GNU MP Library GMP. Fig. 2 plots Table 4. Note that the vertical axis is
logarithmic. We can easily see that the generation of twin primes is faster than that
of RSA primes in every case.



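Table 4. Times of twin-prime generation and RSA-prime generation (sec)

bit size (twin primes, RSA)             (160, 1024)    (224, 2048)    (256, 3072)
--------------------------------------  -------------  -------------  -------------
RSA                                     0.098          0.826          16.274
Twin primes, $\deg(P_d(x)) = 1$         0.047          0.130          0.242
Twin primes, $\deg(P_d(x)) = 2$         0.058          0.164          0.265
Twin primes, $\deg(P_d(x)) = 3$         0.057          0.274          0.401
Twin primes, $\deg(P_d(x)) = 4$         0.057          0.175          0.272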


We present $E/\mathbb{F}_p : y^2 = x^3 + ax + b$ with $t = 3$ in the following. In Examples 2
~ 4, 2 is a primitive root in $\mathbb{F}_{p-2}$.

Example 2

$E_1/\mathbb{F}_p : y^2 = x^3 + a_1x + b_1$, $E_2/\mathbb{F}_p : y^2 = x^3 + a_2x + b_2$, (|p| = 159-bit)

p = 519 51816 01449 69382 38659 23754 49686 02163 04833 66071,

n = 519 51816 01449 69382 38659 23754 49686 02163 04833 66069,

a_1 = 35 29380 82819 03345 16798 59515 21747 57876 817006 32697,

b_1 = 408 46477 52610 12095 24877 04686 28212 53233 12948 77155,

a_2 = 43 94541 02577 39111 90178 78324 59422 25137 69507 32067,

b_2 = 375 64238 02684 72329 52558 68052 72738 84867 16227 32092.
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Fig. 2. Times of twin-prime generation and RSA-prime generation 



Example 3 

E_1/F_p : y^2 = x^3 + a_1 x + b_1, E_2/F_p : y^2 = x^3 + a_2 x + b_2, E_3/F_p : y^2 = x^3 + a_3 x + b_3,

(|p| = 159-bit)

p = 793 54971 71445 13671 92705 06772 26939 83458 80422 30471,
n = 793 54971 71445 13671 92705 06772 26939 83458 80422 30469,

a_1 = 622 32433 75781 36504 38145 80347 56708 57012 73203 93428,
b_1 = 679 39946 41002 62226 89665 55822 46785 65828 08943 39109,

a_2 = 546 59131 03249 88457 46494 19390 10636 40227 07442 50852,

b_2 = 364 39420 68833 25638 30996 12926 73757 60151 38295 00568,

a_3 = 261 88075 85593 34219 51163 09691 46231 55329 60288 84192,

b_3 = 179 85880 00172 30155 26919 24926 22984 48533 06563 08058.



Example 4 

E/F_p : y^2 = x^3 + ax + b, (|p| = 240-bit)

p = 112 49846 54526 86189 73518 65205 55113 42541 99281 27068 83806 23265
87119 55023 07023,

n = 112 49846 54526 86189 73518 65205 55113 42541 99281 27068 83806 23265
87119 55023 07021,

a = 52 37381 80880 77183 56601 62811 25609 08710 91667 71974 15904 90057
09224 69377 60775,

b = 34 91587 87253 84789 04401 08540 83739 39140 61111 81316 10603 26704
72816 46251 73850.
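As a quick consistency check on Example 4, the listed n can be compared with the group order implied by the trace t = 3, using the standard relation #E(F_p) = p + 1 - t. The sketch below assumes that n denotes this order; the helper names are hypothetical, and the digits are copied from the example with the grouping spaces removed.

```python
def parse_grouped(digits):
    """Turn a digit string grouped by spaces into an integer."""
    return int(digits.replace(" ", ""))

# Values of Example 4 (two printed lines joined into one string each).
p = parse_grouped("112 49846 54526 86189 73518 65205 55113 42541 99281 27068 "
                  "83806 23265 87119 55023 07023")
n = parse_grouped("112 49846 54526 86189 73518 65205 55113 42541 99281 27068 "
                  "83806 23265 87119 55023 07021")
t = 3  # trace used throughout this section

def curve_order(p, t):
    """Order of E(F_p) for a curve with trace t: p + 1 - t."""
    return p + 1 - t

print(curve_order(p, t) == n)  # n equals p - 2 for t = 3
print(p.bit_length())          # size of p in bits
```

The check only confirms that the printed numbers are mutually consistent; it says nothing about the primality of p or n.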



Example 5 

E_1/F_p : y^2 = x^3 + a_1 x + b_1, E_2/F_p : y^2 = x^3 + a_2 x + b_2, E_3/F_p : y^2 = x^3 + a_3 x + b_3,
(|p| = 240-bit)

p = 145 62684 79172 80895 91487 33486 94032 72646 08218 46342 12380 03553
12226 43548 52871,

n = 145 62684 79172 80895 91487 33486 94032 72646 08218 46342 12380 03553
12226 43548 52869,

a_1 = 144 44371 02824 33267 37769 11780 11326 91187 09134 83450 79361 18648
91066 43377 85210,

b_1 = 50 11979 94855 57136 68786 73438 08285 32827 34850 99302 48151 81056
65622 14743 74505,

a_2 = 26 77304 81723 26198 90654 78404 65044 67257 17139 39775 54321 43896
98924 70624 48137,

b_2 = 66 39098 14206 44431 24265 63432 08040 69053 47499 08631 07007 63782
36691 94932 49715,

a_3 = 47 49197 80769 28734 86477 41659 37707 95433 64827 81423 90680 35668
50843 51479 03933,

b_3 = 31 66131 87179 52489 90984 94439 58471 96955 76551 87615 93786 90445
67229 00986 02622.



6 Conclusion 

In this paper, we have shown some new explicit conditions on elliptic curve traces
under which curves are vulnerable or secure against FR-reduction. We have also
presented algorithms to construct elliptic curves satisfying our new explicit
conditions. In particular, our new secure elliptic curves allow a rather light
initialization, that is, the set-up of a pair consisting of an elliptic curve and
a basepoint.

Acknowledgments. The authors are grateful to the anonymous referees for their
invaluable comments.

Abstract. In this paper we consider the optimistic approach to
non-repudiation protocols. We study a non-repudiation protocol with an
off-line trusted third party, go on to define multi-party
non-repudiation, compare it to multi-party fair exchange and
show some fundamental differences between these two problems. Finally,
we generalize our protocol and propose a multi-party non-repudiation
protocol with an off-line trusted third party.



1 Introduction 

The impressive growth of open networks during the last decade has given more 
importance to several security related problems. The non-repudiation problem 
is one of them. Non-repudiation services must ensure that when Alice sends 
some information to Bob over a network, neither Alice nor Bob can deny having 
participated in a part or the whole of this communication. Therefore a
non-repudiation protocol has to generate non-repudiation of origin evidences intended
for Bob, and non-repudiation of receipt evidences destined for Alice. In case of a
dispute (e.g. Alice denying having sent a given message or Bob denying having
received it) an adjudicator can evaluate these evidences and decide in
favor of one of the parties without any ambiguity.

In comparison to other security issues, such as privacy or authenticity of
communications, non-repudiation has not been studied intensively. However many
applications such as electronic commerce, fair exchange, certified electronic mail,
etc. are related to non-repudiation. Non-repudiation of origin can easily be
provided by signing the sent information. A digital signature provides an irrefutable
non-repudiation of origin evidence: a digital signature creates a link between a
message and a public verification key. A certification authority assures the
correspondence between this public signature verification key and an identity.
Non-repudiation of receipt is more difficult to achieve: Alice and Bob have
to follow a protocol that assures both services. Such a protocol is said to be fair
if either Alice receives a non-repudiation of receipt evidence and Bob receives a
non-repudiation of origin evidence, or neither of them obtains any valid evidence.

D. Won (Ed.): ICISC 2000, LNCS 2015, pp. 109-122, 2001.

The first solutions to these problems involve a trusted third party (TTP) that
acts as an intermediary between the participating entities. The major
disadvantage of this approach is the communication bottleneck created at the TTP.
Therefore more efficient solutions have been proposed. Two different approaches
have been considered: one consists in designing protocols without a TTP, the
other tries to minimize its involvement.

The approach without TTP involvement is often based on a gradual release
of knowledge. However it generally requires that all involved parties have the
same computational power. Another disadvantage is the large number of
transmitted messages. In [6] a protocol for digital contract signing without a TTP
is proposed: the probability that the contract has been signed is increased in each
round until it reaches one. The assumption of equal computational power is not
needed. Another recently presented probabilistic non-repudiation protocol [13]
also succeeded in relaxing the condition on the computational power. Here the
idea is that the recipient does not know a priori which transmission will contain
the message. The probability of guessing the transmission including the message
is arbitrarily small. However, to decrease this probability the number of messages
has to be increased.

The other approach, which tries to minimize the TTP involvement, has received
more attention in the literature in recent years. Asokan et al. presented, in the
context of fair exchange, the optimistic approach [4, 1]: usually all participants
are honest, and the TTP has to be involved only in the case of a misbehaving
party. The protocols inspired by this approach are said to be protocols using
an off-line TTP.

The most complete non-repudiation protocols with off-line TTP have been 
presented in [16] and independently in [12]. The optimistic approach assumes 
that in general Alice and Bob are honest, i.e. they correctly follow the protocol, 
and the TTP only intervenes, by means of a recovery protocol, when a problem
arises.

Most of the proposed non-repudiation protocols are two-party protocols. In
fair exchange, first steps have been taken to generalize them to the case of n
participants [3, 2, 9, 5]. Concerning non-repudiation protocols, the only work
known to us that generalizes these protocols has been presented in [11] and concerns
a non-repudiation protocol with on-line TTP (which intervenes in each session
of the protocol, even if the parties behave correctly).

The protocol presented here considers an off-line TTP. To the best of our 
knowledge, the only comparable work is the multi-party certified mail protocol 
proposed by Asokan et al. in [2]. However the protocol proposed here is more
general: as will be outlined later in more detail, the certified mail protocol
only continues if the whole set of receivers is willing to do so. Our protocol leaves 
the choice to the sender to finish with only the subset of the responding receivers, 
or to stop in the case of a non-responding receiver, as the certified mail protocol 
does. 

We first recall the properties required of non-repudiation protocols. Then
we present a two-party non-repudiation protocol with off-line TTP.
Afterwards, we define the multi-party non-repudiation problem, showing some
differences with multi-party fair exchange. The requirements of a multi-party
non-repudiation protocol are defined as well. Finally, we present a generalization
of the previously presented two-party non-repudiation protocol to the case of n
parties, using a group encryption scheme.



2 Properties 

2.1 The Communication Channels 

In the framework of such exchange protocols, we can distinguish three classes of 
communication channels: unreliable channels, resilient channels and operational 
channels. No assumptions have to be made about unreliable channels: data may 
be lost. A resilient channel delivers data after a finite, but unknown amount of 
time. Data may be delayed, but will eventually arrive. When using an
operational channel, data arrive after a known, constant amount of time. Operational
channels are however rather unrealistic in heterogeneous networks.



2.2 Requirements on Non-repudiation Protocols 

A first property non-repudiation protocols must provide is fairness. One can dis- 
tinguish between the two notions of strong fairness and weak fairness. 

A non-repudiation protocol is said to provide strong fairness if at the end of the 
protocol Alice has got a complete non-repudiation of receipt evidence if and only 
if Bob has got the message with a complete corresponding non-repudiation of 
origin evidence. 

A non-repudiation protocol provides weak fairness if either the protocol
provides strong fairness, or the protocol provides evidences that can prove to an
adjudicator up to which state the protocol has been executed.

A second property we require is timeliness: a protocol must be finished after 
a finite amount of time for each participating entity that is behaving correctly 
with respect to the protocol. 

Each protocol must ensure both properties to be acceptable. 
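
The two fairness notions can be written down as simple predicates over the outcome of a protocol run. The sketch below is only a reading aid; the `Outcome` record and its field names are assumptions introduced here, not part of the protocol description.

```python
from dataclasses import dataclass

@dataclass
class Outcome:
    """What each party holds when a run ends (hypothetical record)."""
    alice_has_nrr: bool             # complete non-repudiation of receipt evidence
    bob_has_message_and_nro: bool   # message plus complete evidence of origin
    state_provable: bool = False    # evidences show how far the run progressed

def strong_fairness(outcome):
    """Alice holds the NRR evidence if and only if Bob holds m with the NRO evidence."""
    return outcome.alice_has_nrr == outcome.bob_has_message_and_nro

def weak_fairness(outcome):
    """Strong fairness, or at least the reached state can be proven to an adjudicator."""
    return strong_fairness(outcome) or outcome.state_provable
```

For instance, `Outcome(True, False)` violates strong fairness, while `Outcome(True, False, state_provable=True)` still satisfies the weak notion.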

3 A Two-Party Non-repudiation Protocol with Off-line 
TTP 

3.1 Introduction 

We will now present a two-party non-repudiation protocol with an off-line TTP 
[12]. In this protocol, Alice wants to exchange a message m and its corresponding 
non-repudiation of origin evidence against a non-repudiation of receipt evidence, 
issued by Bob. 

This protocol results from modifications made to the Zhou-Gollmann
optimistic protocol [18], in which an operational channel is needed between the TTP
and Alice in order to assure fairness. The protocol presented here only needs
resilient channels between the TTP and Alice and between the TTP and Bob. The
channel between Alice and Bob may even be unreliable. The protocol is similar to
the independently developed autonomous two-party non-repudiation protocol
proposed earlier by Zhou et al. in [16].

The protocol is composed of three sub-protocols: the main protocol, the re- 
covery protocol and the abort protocol. The main protocol consists of messages 
exchanged directly between Alice and Bob. In case of problems during this main 
protocol, two (mutually exclusive) possibilities are offered to the entities. Either 
Alice contacts the TTP to abort the protocol in order to cancel the exchange, 
or Alice or Bob contacts the TTP to launch the recovery protocol in order to 
complete the exchange. 



3.2 Notations 

We use the following notation to describe the protocol:

- X → Y: transmission from entity X to entity Y
- X ↔ Y: ftp get operation performed by X at Y
- h(): a collision resistant one-way hash function
- E_k(): a symmetric-key encryption function under key k
- D_k(): a symmetric-key decryption function under key k
- S_X(): the signature function of entity X
- m: the message sent from A to B
- k: the message key A uses to cipher m
- c = E_k(m): the cipher of m under the key k
- l = h(m, k): a label that in conjunction with (A, B) uniquely identifies a
  protocol run
- f: a flag indicating the purpose of a message
- EOO = S_A(f_EOO, B, l, h(c))
- EOR = S_B(f_EOR, A, l, h(c))
- Sub = S_A(f_Sub, B, l, E_TTP(k))
- EOO_k = S_A(f_EOOk, B, l, k)
- EOR_k = S_B(f_EORk, A, l, k)
- Rec_X = S_X(f_RecX, Y, l)
- Con_k = S_TTP(f_Conk, A, B, l, k)
- Abort = S_A(f_Abort, B, l)
- Con_a = S_TTP(f_Cona, A, B, l)
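
To make the notation concrete, the following sketch computes c, l, EOO and EOO_k for a sample message. It rests on strong simplifying assumptions: the cipher is a toy XOR stream built from SHA-256, and the "signature" S is an HMAC tag under a demo key, which is not a real digital signature and gives no non-repudiation. All names and constants are hypothetical, and the public-key encryption E_TTP(k) is left out.

```python
import hashlib
import hmac

def h(*parts):
    """Hash of length-prefixed parts, so that (b"ab", b"c") and (b"a", b"bc") differ."""
    digest = hashlib.sha256()
    for part in parts:
        digest.update(len(part).to_bytes(4, "big") + part)
    return digest.digest()

def E(k, data):
    """Toy stream cipher (SHA-256 in counter mode XOR data); D_k is the same function."""
    stream = b"".join(hashlib.sha256(k + i.to_bytes(4, "big")).digest()
                      for i in range(len(data) // 32 + 1))
    return bytes(x ^ y for x, y in zip(data, stream))

def S(signing_key, *fields):
    """Stand-in for S_X: an HMAC tag. A real run needs a public-key signature."""
    return hmac.new(signing_key, h(*fields), hashlib.sha256).digest()

m, k = b"pay 10 EUR", b"session-key"      # sample message and message key
alice_key = b"alice-demo-key"             # hypothetical signing key of A

c = E(k, m)                               # c = E_k(m)
l = h(m, k)                               # label l = h(m, k)
EOO = S(alice_key, b"f_EOO", b"B", l, h(c))
EOO_k = S(alice_key, b"f_EOOk", b"B", l, k)
```

Note that EOO covers only h(c), so it can be checked without knowing k, whereas EOO_k binds the key itself to the same label l.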

3.3 Main Protocol 

The basic idea of the main protocol is to first exchange the cipher of the
message m against a receipt for this cipher. Secondly, we exchange the decryption
key against a receipt for this key. Each transmission is associated with some
maximum time-out. Once this time-out value is exceeded the recipient supposes the
transmission will not arrive and initiates either a recovery or an abort protocol.

1. A → B: f_EOO, f_Sub, B, l, c, E_TTP(k), EOO, Sub
2. B → A: f_EOR, A, l, EOR        (time-out: abort)
3. A → B: f_EOOk, B, l, k, EOO_k  (time-out: recovery[X := B, Y := A])
4. B → A: f_EORk, A, l, EOR_k     (time-out: recovery[X := A, Y := B])



Alice starts the protocol by sending Bob the cipher of the message, as well as the
decryption key ciphered under the public key of the TTP. The message
also contains Alice’s signatures on the encrypted key and on the hash of the
cipher. These signatures serve as evidences of origin for the ciphers. In this first
message we use two purpose flags as it conveys two proofs (EOO and Sub).

If Bob receives the first message he replies with a receipt to confirm the 
arrival of the first message. This receipt contains Bob’s signature on the hash of 
the cipher c and serves to Alice as an evidence of receipt for the cipher. 

In the case that Alice does not receive message 2 from Bob before a given 
time-out, she initiates the abort protocol. Note that Alice cannot perform a 
recovery at this moment, as the recovery protocol requires EOR, the evidence of 
receipt for the cipher. 

If message 2 reaches Alice before the time-out, she sends Bob the
decryption key k, as well as her signature on this key. This signature is used as an
evidence of origin for the key. The evidence of origin for the cipher c and
the evidence of origin for the key k together form the non-repudiation of
origin evidence of the message m.

Message 3 has to reach Bob before a given time-out. Otherwise Bob
initiates the recovery protocol with the TTP.

If message 3 arrives in time, Bob sends a receipt for the key to Alice: his
signature on the key k. The signature serves as the evidence of receipt for the
key. Together with the evidence of receipt for the cipher c, it forms the
non-repudiation of receipt evidence of the message m.

Alice may also initiate the recovery protocol with the TTP if this last message 
does not arrive in time. 
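
The time-out rules of the main protocol can be summarised as a small decision function. This is a sketch only: each boolean stands for "the message arrived before its time-out", and the returned strings are labels chosen here for illustration.

```python
def main_protocol(msg2_arrives, msg3_arrives, msg4_arrives):
    """Return how a two-party run ends, given which messages arrive in time.

    Message 1 (A -> B) is assumed to have been sent by Alice; if it is lost,
    Bob never answers and the run looks to Alice like a time-out on message 2.
    """
    if not msg2_arrives:
        return "abort by A"        # Alice holds no EOR, so she cannot recover
    if not msg3_arrives:
        return "recovery by B"     # Bob is missing k and EOO_k
    if not msg4_arrives:
        return "recovery by A"     # Alice is missing EOR_k
    return "completed without TTP"

print(main_protocol(True, True, False))
```

Only the last outcome leaves the TTP uninvolved, which is the optimistic case the protocol is designed for.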



3.4 Recovery Protocol 

To launch the recovery protocol Alice or Bob has to send to the TTP the hash
of c, the key k ciphered for the TTP, the evidence of origin for the cipher c,
EOO, the evidence of origin for the encrypted key, Sub, the evidence of receipt
for the cipher c, EOR, as well as the evidence of origin for the recovery request,
Rec_X (where X may take the values A or B). Note that the recovery protocol
can only be executed once per protocol run and is mutually exclusive with the
abort protocol.

By means of these evidences the TTP can be sure that Alice sent the
cipher c to Bob and that Bob really received it.




114 O. Markowitch and S. Kremer



1. X → TTP: f_RecX, f_Sub, Y, l, h(c), E_TTP(k), Rec_X, Sub, EOR, EOO
      if aborted or recovered then stop
      else recovered=true
2. TTP → A: f_Conk, A, B, l, k, Con_k, EOR
3. TTP → B: f_Conk, A, B, l, k, Con_k

When the first message arrives, the TTP checks whether an abort protocol or
a recovery protocol has already been started for this protocol run: a protocol run
is uniquely identified by the label l = h(m, k) and the identities (A, B). If either
an abort or a recovery protocol has already been initiated, the TTP halts. The
resilient channels assure that some message ending the protocol will eventually
arrive at X. If the TTP accepts to perform a recovery protocol, it sends
to Alice the confirmation of submission of the key, as well as the evidence of
receipt for the cipher, EOR. It is important to include EOR, as Bob can initiate
the recovery protocol after having received the cipher, without having sent a
receipt for it. The TTP sends to Bob the key k, as well as the confirmation of
the submission of the key, serving to Bob as an evidence of origin for k.

If the recovery protocol is executed, the key confirmation evidence Con_k
becomes part of the non-repudiation evidences for the message m. It
replaces both the evidence of origin for the key and the evidence of receipt
for the key.

3.5 Abort Protocol 

Alice has the possibility to run an abort protocol. If she decides to do so she
sends a signed abort request, including the label l, to the TTP.

If the TTP accepts the request (neither a recovery nor an abort has yet been
initiated), the TTP sends to both Alice and Bob a signed abort confirmation.

1. A → TTP: f_Abort, l, B, Abort
      if recovered or aborted then stop
      else aborted=true
2. TTP → A: f_Cona, A, B, l, Con_a
3. TTP → B: f_Cona, A, B, l, Con_a

Note that Alice could specify a wrong B in her abort request. However this
would refer to a different protocol run (a protocol run is identified by l and
(A, B)) and would still enable Bob to launch a recovery protocol.
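
The mutual exclusion between abort and recovery amounts to the TTP keeping one status per protocol run, keyed by (A, B, l). A minimal sketch of that bookkeeping follows; it deliberately omits the verification of Rec_X, Sub, EOR, EOO and Abort as well as the decryption of E_TTP(k), and the class and method names are hypothetical.

```python
class TTP:
    """Off-line TTP state: at most one of abort/recovery per run (A, B, l)."""

    def __init__(self):
        self.status = {}  # (A, B, l) -> "aborted" or "recovered"

    def recover(self, a, b, label, k):
        run = (a, b, label)
        if run in self.status:
            return None                      # "stop": run already settled
        self.status[run] = "recovered"
        return ("Con_k", a, b, label, k)     # sent to both A and B

    def abort(self, a, b, label):
        run = (a, b, label)
        if run in self.status:
            return None                      # "stop": run already settled
        self.status[run] = "aborted"
        return ("Con_a", a, b, label)        # sent to both A and B

ttp = TTP()
ttp.abort("A", "Mallory", "l1")              # abort naming a wrong receiver
print(ttp.recover("A", "B", "l1", b"k"))     # the run (A, B, l1) is unaffected
```

The last two lines illustrate the remark above: an abort request naming a wrong B refers to a different run and does not block Bob's recovery.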

3.6 Dispute Resolution 

The non-repudiation of origin and receipt evidences for message m are the
following:

- NRO = (EOO, EOO_k) or NRO = (EOO, Con_k)
- NRR = (EOR, EOR_k) or NRR = (EOR, Con_k)

Repudiation of Origin. When Alice denies the origin of the message, Bob has
to present to the judge EOO, EOO_k or Con_k, l, c, m and k. The judge verifies that

- EOO = S_A(f_EOO, B, l, h(c)),
- EOO_k = S_A(f_EOOk, B, l, k) or Con_k = S_TTP(f_Conk, A, B, l, k),
- l = h(m, k),
- c = E_k(m).

If Bob can provide all the required items and all the checks hold, the adjudicator 
claims that Alice is at the origin of the message. 



Repudiation of Receipt. When Bob denies receipt of m, Alice can prove his
receipt of the message by presenting EOR, EOR_k or Con_k, l, c, m and k to a
judge. The judge verifies that

- EOR = S_B(f_EOR, A, l, h(c)),
- EOR_k = S_B(f_EORk, A, l, k) or Con_k = S_TTP(f_Conk, A, B, l, k),
- l = h(m, k),
- c = E_k(m).

If Alice can present all of the items and all the checks hold, the adjudicator 
concludes that Bob received the message. 
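
The judge's checks for a repudiation of origin can be sketched as below, taking EOO as defined in Section 3.2 (a signature over h(c)). As before, this is an illustration under stated assumptions: HMAC tags under hypothetical demo keys stand in for real signatures (so this "judge" needs the secret keys, which a real judge would not), and the cipher is a toy XOR stream.

```python
import hashlib
import hmac

KEYS = {"A": b"alice-demo-key", "TTP": b"ttp-demo-key"}  # hypothetical demo keys

def h(*parts):
    digest = hashlib.sha256()
    for part in parts:
        digest.update(len(part).to_bytes(4, "big") + part)
    return digest.digest()

def E(k, data):
    """Toy XOR stream cipher standing in for E_k."""
    stream = b"".join(hashlib.sha256(k + i.to_bytes(4, "big")).digest()
                      for i in range(len(data) // 32 + 1))
    return bytes(x ^ y for x, y in zip(data, stream))

def S(who, *fields):
    """Stand-in for S_X (HMAC tag, not a real signature)."""
    return hmac.new(KEYS[who], h(*fields), hashlib.sha256).digest()

def verify(who, sig, *fields):
    return hmac.compare_digest(sig, S(who, *fields))

def judge_origin(EOO, EOO_k, Con_k, l, c, m, k):
    """True if the presented items show that Alice is at the origin of m."""
    key_ok = ((EOO_k is not None and verify("A", EOO_k, b"f_EOOk", b"B", l, k)) or
              (Con_k is not None and verify("TTP", Con_k, b"f_Conk", b"A", b"B", l, k)))
    return (verify("A", EOO, b"f_EOO", b"B", l, h(c))
            and key_ok and l == h(m, k) and c == E(k, m))
```

The check for a repudiation of receipt has the same shape, with EOR and EOR_k (signed by B) in place of EOO and EOO_k.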

3.7 Fairness and Timeliness 

If Bob stops the protocol after having received the first message, Alice may
perform the abort protocol, in order to prevent Bob from initiating a recovery later.
As neither Bob nor Alice has received complete evidences, the protocol remains fair.
If Bob has already initiated the recovery protocol, the TTP sends all the missing
evidences to Alice and Bob. Note that the TTP also sends the EOR to Alice, as
she has not received it yet. Thus the protocol stays fair.

If Alice does perform step 3, Bob receives a complete non-repudiation of origin 
evidence. There are two ways to finish the protocol: either Bob sends message 
4 of the main protocol and Alice receives a complete non-repudiation of receipt 
evidence or Alice performs the recovery protocol. As the channels between the 
TTP and both Alice and Bob are resilient, all data sent by the TTP to Alice 
and Bob eventually arrive. In both cases all entities receive valid evidences and 
the protocol finishes providing fairness. 

If Alice does not send message 3 during the main protocol, Alice and Bob 
may initiate the recovery protocol. Fairness is still guaranteed, as during the 
recovery protocol, Alice and Bob receive all expected evidences. 

In the case where Alice sends a wrong key k' either to Bob or to the TTP
(E_TTP(k')), the evidences will not be valid as l ≠ h(m, k'). If Alice also transmits
the label h(m, k'), the generated evidence will correspond to the message m' =
D_k'(c). As m' is the message Bob actually received, fairness is still provided.
Note that the protocol achieves strong fairness.

When looking at timeliness, three situations may arise: the main protocol
ends successfully (without any time-out); Alice aborts the protocol and
the abort confirmation signed by the TTP arrives at Alice and Bob after a
finite amount of time, as the channels between the TTP and both Alice and Bob
are resilient; a recovery protocol is performed and Alice and Bob receive the
evidences after a finite amount of time because of the resilience of the channels.

4 Multi-party Non-repudiation 

In the literature, different kinds of multi-party fair exchange have been considered.
In [9] a classification has been proposed. One can distinguish between single-unit
and multi-unit exchanges. Moreover different topologies are possible: [9] and [5]
concentrated on a ring topology. Each entity e_i (0 ≤ i ≤ n - 1) desires an item
(or a set of items) from entity e_{i⊟1} and offers an item (or a set of items) to
entity e_{i⊞1}, where ⊞ and ⊟ respectively denote addition and subtraction modulo n.
Another topology is the more general matrix topology, where each entity may
desire items from a set of entities and offer items to a set of entities. Such
protocols have been proposed by Asokan et al. in [3] and [2].
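
In the ring topology the neighbours of an entity are obtained by index arithmetic modulo n. A one-function sketch (the function name is hypothetical):

```python
def ring_neighbours(i, n):
    """For entity e_i in a ring of n entities, return the pair
    (index of the entity it receives from, index of the entity it offers to),
    i.e. subtraction and addition of 1 modulo n."""
    return (i - 1) % n, (i + 1) % n

print([ring_neighbours(i, 4) for i in range(4)])
```

With n = 4, entity e_0 receives from e_3 and offers to e_1, which closes the ring.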

A fundamental difference between non-repudiation and fair exchange is the 
following. In non-repudiation, the originator sends some data with a non-repu- 
diation of origin evidence to a recipient, who has to respond with a non-repudia- 
tion of receipt evidence. The sent data is generally not known to the recipient a 
priori. In a fair exchange each entity offers an a priori known item and receives 
another item, also known a priori. In a multi-party fair exchange protocol one 
can imagine sending an item to one entity and receiving an item from a different 
one. In non-repudiation it does not make sense that one entity receives some 
data and a distinct entity sends the corresponding receipt. Thus a ring topology 
is not sound. The most natural generalization, and the one considered here, seems
to be a one-to-many protocol, where one entity sends a message to n - 1 receiving
entities who respond to the sender. However other possibilities for generalization
exist (many-to-one, many-to-many), although they seem to be less natural.

The expectations we have towards such a protocol are rather similar to the 
properties required in two-party non-repudiation. A multi-party non-repudiation 
protocol is said to provide strong fairness if at the end of the protocol the sender 
has got a complete non-repudiation of receipt evidence for a given recipient if 
and only if this recipient has got the message with a complete corresponding 
non-repudiation of origin evidence. 

A multi-party non-repudiation protocol is said to provide weak fairness if either
the protocol provides strong fairness, or the protocol provides evidences that can
prove to an adjudicator up to which state the protocol has been executed with
each of the receivers.

Here we can clearly see the difference with the certified mail protocol
proposed by Asokan et al. Their protocol requires that at the end of the protocol
either all receivers have got the message with the corresponding origin evidence and
the sender has got a receipt from every receiver, or none of them has gained any
valuable information. A last required property is the timeliness property, defined
as for two-party protocols.

5 A Multi-party Optimistic Non-repudiation Protocol 

We propose a generalization of the presented two-party non-repudiation protocol, 
using an off-line TTP. We suppose that the channels between the TTP and both 
Alice and all possible receivers are resilient. The channels between Alice and any 
receiver may be unreliable. 

5.1 Notations 

The following notation will be used:

- B: the set of receiving entities
- A ⇒ B: multicast from Alice to the set of entities B
- E_B(): a group encryption scheme E, that can be deciphered by each party
  P ∈ B
- B': the set of receiving entities having sent an evidence of receipt for the
  cipher to Alice
- l = h(m, k): a label that in conjunction with the identity A uniquely identifies
  a protocol run
- EOO = S_A(f_EOO, B, l, t, h(c))
- EOR_i = S_Bi(f_EOR, A, l, h(c))
- Sub = S_A(f_Sub, B, l, E_TTP(k))
- EOO_k = S_A(f_EOOk, B', l, h(k))
- EOR_{i,k} = S_Bi(f_EORk, A, l, h(k))
- Rec_X = S_X(f_RecX, A, B, l)
- Con_k = S_TTP(f_Conk, A, B', l, h(k))
- Early = S_TTP(f_Early, l)
- Set = S_A(f_Set, B', l)

5.2 Main Protocol 

In the main protocol Alice sends a cipher of the message to all potential
receivers B. Some of these receivers (the set B') send a receipt for this cipher.
Alice continues the protocol with the receivers in B' by sending them the key
corresponding to the cipher of message 1.

1. A ⇒ B:     f_EOO, f_Sub, B, l, t, c, E_TTP(k), EOO, Sub
2. B_i → A:   f_EOR, A, l, EOR_i
              where B_i ∈ B and i ∈ {1, ..., |B|}
3. A ⇒ B':    f_EOOk, B', l, E_B'(k), EOO_k
4. B'_i → A:  f_EORk, A, l, EOR_{i,k}
              where B'_i ∈ B' and i ∈ {1, ..., |B'|}

The first message, destined to all receivers in B, includes the label, the cipher,
as well as the key k ciphered using the public key of the TTP (this information
is used by the TTP in the case of a recovery). Alice also sends a time-out t: a
recovery may only be performed after t. If one of the receivers does not accept
the time-out he may stop the protocol.

After having sent the cipher in message 1, Alice decides when to
continue the protocol. Receipts arriving after this moment are no longer
considered, without any risk for the receivers of losing fairness. To realize a service
similar to the certified mail presented in [2], Alice may stop the protocol if not
all of the receivers in B have answered.

Afterwards, when Alice sends the deciphering key, it is crucial that only the
recipients in B' receive it. Therefore the key has to be ciphered (this is not needed
in a two-party protocol). In order to cipher only once and to use multicasting,
Alice uses a group encryption scheme. The idea is that the key can be ciphered
in such a way that only recipients B'_i ∈ B' can decipher it. Examples of such
ciphering methods can be found in [8] and [10].
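
The access-control effect of message 3 can be illustrated with a deliberately naive stand-in: the key k is wrapped once per member of B' under a key shared with that member. This is not one of the group encryption schemes of [8] or [10], it does not achieve the "cipher only once" goal, and the toy XOR stream is not a secure cipher; all names and keys are hypothetical.

```python
import hashlib

def xor_stream(key, data):
    """Toy XOR stream cipher; applying it twice with the same key restores data."""
    stream = b"".join(hashlib.sha256(key + i.to_bytes(4, "big")).digest()
                      for i in range(len(data) // 32 + 1))
    return bytes(x ^ y for x, y in zip(data, stream))

def group_encrypt(k, recipient_keys):
    """Wrap k separately for every recipient in B' (name -> shared key)."""
    return {name: xor_stream(key, k) for name, key in recipient_keys.items()}

def group_decrypt(bundle, name, own_key):
    """Return k if `name` belongs to the set the bundle was built for, else None."""
    wrapped = bundle.get(name)
    return None if wrapped is None else xor_stream(own_key, wrapped)

keys = {"B1": b"key-1", "B2": b"key-2", "B3": b"key-3"}   # all receivers B
b_prime = {"B1", "B3"}                                    # those who sent EOR_i
bundle = group_encrypt(b"session-key", {n: keys[n] for n in b_prime})
print(group_decrypt(bundle, "B2", keys["B2"]))            # B2 is not in B'
```

A receiver outside B' learns from the multicast only that he is not among the intended recipients, which matches the behaviour required of E_B'.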



5.3 Recovery Protocol 

At any moment during the main protocol Alice and the receivers have the possibility
to launch the recovery protocol with the TTP. The recipients in B' launch a
recovery if Alice does not multicast the ciphered key; Alice initiates the recovery
protocol if not all recipients in B' send a receipt for the ciphered key.



1. X → TTP: f_RecX, f_Sub, A, B, l, t, h(c), E_TTP(k), Rec_X, Sub, EOO
      if recovered then
          TTP → X: f_Conk, A, B', l, E_B'(k, S_TTP(k)), Con_k
      else if before t then
          TTP → X: f_Early, Early
      else
          recovered=true
2. TTP ↔ A: f_Set, B', l, Set
3. TTP → A: f_Conk, A, B', l, Con_k
4. TTP → B' ∪ {X}\{A}: f_Conk, A, B', l, E_B'(k, S_TTP(k)), Con_k



The recovery cannot be initiated before t. The time-out t is specified by Alice
in the first message of the main protocol and is resent, together with Alice’s
signature, in the recovery request. If someone tries to execute a recovery
before t, the TTP sends a message to notify that the recovery cannot yet be
initiated.

When the recovery protocol is initiated for the first time after t, the TTP uses
ftp get to fetch the set B' at Alice. For this purpose Alice maintains a read-only
accessible directory containing the set of users and her signature on this set. If
the directory is not accessible the TTP supposes B' = ∅.

As soon as the TTP has performed the ftp get on Alice’s public directory, Alice
considers that a recovery is in execution and stops the execution of the main protocol.
Otherwise a malicious B_i could be inserted in B' after the ftp get, and B_i could
exploit a race condition to cheat Alice.

Now the TTP sends to Alice the confirmation of receipt of the key, which may
be used to substitute EOR_{i,k} for all i such that B_i ∈ B'. The TTP sends to all
receivers B_i ∈ B' the confirmation of the key, which substitutes EOO_k, as well as
the signed key ciphered for B' using a group encryption scheme.

If a receiver B_i ∉ B' wants to perform a recovery after the recovery has
already been performed for the first time (the TTP uniquely identifies each
protocol run by l and A), the TTP sends to this entity the same message as to
each B_i ∈ B'. This message is however useless to him, as k has been ciphered for
the set B'; it only informs the recipient that he does not belong to this set.
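
The TTP's decision logic in the multi-party recovery can be sketched as follows. The sketch follows the order of the tests in the listing of step 1 (already recovered, then too early, then first recovery). Evidence verification and the group encryption of the key are omitted, the ftp get is modelled by a callable, and all names are hypothetical.

```python
class MultiPartyTTP:
    """Recovery bookkeeping per run (A, l); fetch_set stands in for the ftp get."""

    def __init__(self, fetch_set):
        self.fetch_set = fetch_set   # callable(label) -> iterable of names, or None
        self.recovered = {}          # (A, l) -> frozenset B'

    def recover(self, a, label, t, now):
        run = (a, label)
        if run in self.recovered:
            # Later requests get the same confirmation, ciphered for the stored B'.
            return ("Con_k", self.recovered[run])
        if now < t:
            return ("Early",)        # recovery not yet allowed
        fetched = self.fetch_set(label)
        # An inaccessible directory is treated as the empty set.
        self.recovered[run] = frozenset(fetched or ())
        return ("Con_k", self.recovered[run])

ttp = MultiPartyTTP(lambda label: {"B1", "B3"})
print(ttp.recover("A", "l1", t=100, now=50))
print(ttp.recover("A", "l1", t=100, now=150))
```

Because the set B' is stored at the first successful recovery, a receiver outside B' who asks later receives a confirmation that names a set he does not belong to.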



5.4 Dispute Resolution 

At the end of a successful protocol execution, each recipient B_i ∈ B' and Alice
receive the following non-repudiation of origin and of receipt evidence, respectively,
for message m:

- NRO = (EOO, EOO_k) or NRO = (EOO, Con_k)
- NRR_i = (EOR_i, EOR_{i,k}) or NRR_i = (EOR_i, Con_k)



Two kinds of disputes can arise: repudiation of origin and repudiation of 
receipt. Repudiation of origin arises when a recipient Bi claims having received 
a message m from Alice, who denies having sent it. Repudiation of receipt arises 
when Alice claims having sent a message m to a recipient Bi who denies having 
received it. 



Repudiation of Origin. When Alice denies the origin of the message, B_i has
to present to the judge EOO, EOO_k or Con_k, l, t, m, k, B and B'.

The message m and the key k have to be sent to the judge via a secure
channel, for example using encryption. Otherwise a recipient B_j ∈ B\B' could
recover the transmission. If any of these items cannot be provided, the
recipient’s claim is rejected.

The judge validates the recipient’s claim if he can successfully verify that:

- EOO = S_A(f_EOO, B, l, t, h(c)) after having computed c = E_k(m) and h(c),
- EOO_k = S_A(f_EOOk, B', l, h(k)) or Con_k = S_TTP(f_Conk, A, B', l, h(k)),
- l = h(m, k).

Repudiation of Receipt. When B_i denies receipt of m, A can prove his receipt
of the message by presenting EOR_i, EOR_{i,k} or Con_k, l, m, k and B' to a judge.
As above a secure channel is needed to transmit m and k.

To accept Alice’s claim, the judge verifies that

- EOR_i = S_Bi(f_EOR, A, l, h(c)) after having computed c = E_k(m) and h(c),
- EOR_{i,k} = S_Bi(f_EORk, A, l, h(k)) or Con_k = S_TTP(f_Conk, A, B', l, h(k)),
- l = h(m, k).



5.5 Fairness and Timeliness 

Our generalized protocol provides strong fairness. In fact when the main protocol
ends without problems, the non-repudiation evidences have been exchanged and
all receivers in B' got the message m. When a problem arises both Alice and
the receivers can initiate the recovery protocol at any moment following the
transmission of message 1. As outlined in the previous section, Alice maintains a
read-only accessible directory that is consulted by the TTP during the recovery to
get the description of the set B'. The TTP then sends the missing evidences to both
Alice and the entities in B'.

If Alice tries to cheat by submitting a wrong key k', the generated evidence 
will not be valid as $l \neq h(m, k')$. If Alice also transmits $l = h(m, k')$, the evidence 
will be generated for the message $m' = D_{k'}(c)$ and Alice can only prove that 
$B_i$ received m'. Alice could also try to cheat by publishing a wrong set B''. 
Publishing a set B'' smaller than B' does not provide an advantage, as Alice would 
not receive the confirmation of receipt of the key for the entities in $B' \setminus B''$. If B'' 
is bigger than B', Alice would harm herself, as all the entities in $B'' \setminus B'$ receive 
k with a confirmation of origin for k, while Alice does not have evidence of 
receipt of the cipher from those entities. Hence Alice has no interest in 
publishing a different set. 
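
The way the label binds the key to the message can be illustrated with a short Python sketch. It rests on assumptions that are not part of the paper: SHA-256 stands in for the hash function h, the inputs are length-prefixed before hashing, and the function names are hypothetical.

```python
import hashlib


def h(*parts: bytes) -> bytes:
    """Hash a sequence of byte strings, length-prefixing each part."""
    digest = hashlib.sha256()
    for part in parts:
        digest.update(len(part).to_bytes(4, "big"))
        digest.update(part)
    return digest.digest()


def label_is_consistent(label: bytes, m: bytes, k: bytes) -> bool:
    """The check l = h(m, k) applied to a submitted message and key."""
    return label == h(m, k)


m = b"contract text"
k = b"session key"
label = h(m, k)                     # label fixed at the start of the protocol

honest = label_is_consistent(label, m, k)
cheating = label_is_consistent(label, m, b"wrong key")
print(honest, cheating)
```

With a collision resistant hash, a key different from the one used to build the label fails the check, which is the situation described for a wrong key k'.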

Now consider a scenario where at the second step several receivers send the 
evidence of receipt. After a fixed amount of time, Alice continues the protocol. 
Now, if a group of late receipts arrives at Alice, she will possess evidence of 
receipt for the cipher c from these receivers. However, fairness is not threatened 
by this scenario, as Alice will not receive a confirmation of receipt for the 
key from those entities. If Alice included these receivers in B', they would 
also receive k and the corresponding evidence of origin. So strong fairness is 
still provided. 

We shall now show that timeliness is also respected. Alice has two 
possibilities: either she finishes the main protocol or she has to initiate a recovery 
protocol. Timeliness in the latter case is assured by the resilient channels. 
A recipient $B_i$ can either finish the main protocol or launch a recovery. If he 
launches a recovery, several cases may arise. He may be too early (before t), in which case 
he has to relaunch the recovery after t (t has been known a priori when the 
recipient agreed to continue the protocol). Otherwise, if $B_i$ successfully initiates 
the protocol, the TTP will send him the ciphered key with the corresponding 
confirmation of origin. As the channels between the TTP and $B_i \in B$ are resilient, 
the protocol finishes after a finite time, preserving timeliness. 

6 Conclusion 

We have defined multi-party non-repudiation and presented a generalization of 
an optimistic two-party non-repudiation protocol to an n-party non-repudiation 
protocol. To the best of our knowledge this is the first optimistic multi-party 
non-repudiation protocol. We have shown that the generalized protocol provides 
strong fairness and respects timeliness. 

Acknowledgments. The authors would like to thank the anonymous referees for 
their helpful comments on the draft version of this paper. 

Abstract. A matchmaking protocol is a procedure to find matched 
pairs in registered groups of participants depending on their choices, 
while preserving their privacy. In this study we define the concept 
of matchmaking and construct a simple and efficient matchmaking 
protocol under the simple rule that two members become a matched 
pair only when they have chosen each other. In a matchmaking protocol, 
participants’ privacy is of prime concern; in particular, losers’ choices 
should not be opened. Our basic approach to achieving privacy is finding 
collisions among multiple secure commitments without decryption. For 
this purpose we build a protocol to find collisions in ElGamal ciphertexts 
without decryption using Michels and Stadler’s protocol [MS97] for 
proving the equality or inequality of two discrete logarithms. Correctness 
is guaranteed because all procedures are universally verifiable. 

Keywords: matchmaking, secure multiparty computation, proof of 
knowledge, proving the equality or inequality of two discrete logarithms, 
finding collisions without decryption, public commitment. 



1 Introduction 

Consider a set of parties who trust neither other entities nor the channels by 
which they communicate. The parties wish to correctly compute some common 
function of their local inputs, while keeping their local data as private as pos- 
sible. This is the problem of secure multiparty computation [Yao82], [GMW87], 
[CCD88], [BGW88], [Can96], [Gol98], [Cra99], which has fundamental impor- 
tance in cryptography and is relevant to many distributed cryptographic appli- 
cations such as electronic cash, voting, auction, and so on. 

In this study we consider a problem of finding matched pairs in registered 
groups of participants depending on their choices, while preserving their privacy. 
This is an interesting example of secure multiparty computation where the com- 
mon function is finding matched pairs in the registered groups of participants 
and local data are participants’ choices. 

There is a popular TV program in Korea which performs matchmaking 
between two registered groups of men and women. In the program all participants 
commit their choices to a host secretly, but in the opening stage the host opens 
all the choices and decides couples. The established couple members do not mind 
having their choices opened, but losers may wish that their choices not be 
opened if possible. A loser can have another chance to participate in 
matchmaking, and in that case his or her previous choice is important private information. 
So secrecy of commitment is a prime security issue in this case. 

D. Won (Ed.): ICISC 2000, LNCS 2015, pp. 123-134, 2001. 

Another example can be found in setting up project teams in a class. The lec- 
turer of the class tries to permit students to form project teams of two members 
if they want each other. The choices of established team members are published, 
but losers who could not form a team may wish that their choices might not be 
opened if possible. 

In this study we consider a typical model of a secure matchmaking protocol 
which is used to set up couples among m male members $M_i$ ($i = 1, \ldots, m$) and 
n female members $F_j$ ($j = 1, \ldots, n$). The basic rule of matchmaking is that in the 
commitment stage each participant commits a single choice to TTP secretly and 
then in the opening stage two participants are decided as a couple only when they 
have chosen each other. Our basic approach to providing secrecy of commitment is 
finding collisions in ElGamal ciphertexts without decryption. For this purpose we 
use Michels and Stadler’s protocol [MS97] for proving the equality or inequality 
of two discrete logarithms. Our design provides the secrecy of commitment and 
guarantees the correctness of results. 

This paper is organized as follows. In section 2, we describe our model of 
matchmaking and its security requirements. Then in section 3, we describe some 
building blocks such as public commitment, proving the correctness of decryp- 
tion, and finding collisions in ElGamal ciphertexts. Using these primitives we 
construct a simple and efficient matchmaking protocol in section 4 and provide 
its security analysis in section 5. 



2 Model of Matchmaking 

2.1 Definition of Terms 

In this paper the following terms are used in a specific sense, so we need to define 
them more rigorously. 

Matchmaking is a protocol to find matched pairs $\langle a_i, b_j \rangle$ between two 
registered groups of participants $A = (a_1, \ldots, a_m)$ and $B = (b_1, \ldots, b_n)$ which 
satisfy $\langle a_i, b_j \rangle \in R$ for a special relation R we try to find. 

Established couple is a matched pair $\langle a_i, b_j \rangle \in R$ which is found using the 
matchmaking protocol. 

Loser is a participant who was not established as a couple. All participants 
except the established couple members are losers. 

Registered group of participants is the members registered in the match- 
making system who participate in the matchmaking process and try to find 
a partner there. 

Public commitment is a commitment scheme with which a participant 
commits his or her choice to the public. The correctness of the result should be 
publicly verifiable. 

Collision in ciphertexts is the case that two ciphertexts are probabilistic 
encryptions of the same message. 

Proof of coupling is a proof for the correctness of the established couple. 

2.2 Participants and Tools 

Our matchmaking protocol has the following participants and tools. All 
participants are assumed to have their own public/private key pairs certified by a 
certificate authority (CA). 

Male participants: m male members $M_i$ ($i = 1, \ldots, m$) want to find partners 
among the female members. 

Female participants: n female members $F_j$ ($j = 1, \ldots, n$) want to find partners 
among the male members. 

TTP(Host): A trusted third party T is a host of matchmaking and medi- 
ates the matchmaking procedure, and can be modeled as a probabilistic 
polynomial-time Turing machine. T receives participants’ public commit- 
ments as input and outputs the results of matchmaking and proofs of them. 
W.l.o.g., it is assumed that T does not collude with any specific participant. 

Bulletin board: It is a public communication channel which can be read by 
anybody. Only a legitimate participant is allowed to post his or her message 
on the specified region of the bulletin board. All participants communicate 
via the bulletin board. 



2.3 Rule of Matchmaking 

In this study we consider the following simplified model of matchmaking between 
two registered groups of men and women. 

— Each participant in a group has a single choice among the participants of 
the other group. 

— Two participants are decided as a couple only when they have chosen each 
other. 
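
The rule itself, leaving all privacy concerns aside, can be written down in a few lines of Python. This is a plaintext reference sketch only: the dictionaries of choices and the function name are hypothetical, and revealing the choices in this way is exactly what a secure protocol has to avoid.

```python
def find_couples(male_choice: dict[str, str],
                 female_choice: dict[str, str]) -> list[tuple[str, str]]:
    """Return the pairs (man, woman) who have chosen each other."""
    couples = []
    for man, woman in male_choice.items():
        # A couple is established only when the choice is mutual.
        if female_choice.get(woman) == man:
            couples.append((man, woman))
    return sorted(couples)


male_choice = {"M1": "F2", "M2": "F2", "M3": "F1"}
female_choice = {"F1": "M3", "F2": "M1", "F3": "M1"}

couples = find_couples(male_choice, female_choice)
print(couples)  # [('M1', 'F2'), ('M3', 'F1')]
```

In this example M2 and F3 are losers; under the security requirements their choices should stay hidden.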

In the real world there can be a variety of situations and rules for match- 
making. Some possible examples are committing multiple choices, setting up a 
team of multiple partners, etc, and each can be a good model for the study of 
secure multiparty computation. 



2.4 Security Requirements 

The main security concern of matchmaking is the secrecy of participants’ choices. 
A participant wants to keep his or her own choice as private as possible 
while trying to get any information on others’ choices. Even an established 
couple member may want to know who else has chosen him or her besides the 
current partner. A loser may want to know who has chosen him or her, because 
it can be important information for the next round of matchmaking, while it is 
also private information for the loser. Therefore secrecy of commitment should 
be provided such that malicious participants cannot get any partial information 
on others’ choices. 

Another scenario is that a participant A can try to help another participant 
B, i.e., A helps B to set up a couple $\langle B, C \rangle$. If this is possible, B gets to 
have two choices, against the fairness of the matchmaking rule. So the 
authenticity of commitment is required and the correctness of the result should be 
publicly verifiable. 

Still another scenario is that a member of an established couple may want 
to change his or her mind later and tries to repudiate the result. It is clear that 
this should not be allowed. 

The security requirements of secure matchmaking protocol can be listed as 
follows. 

Secrecy (privacy). The choices of participants should not be exposed to others 
including TTP in the whole process of matchmaking. Only the choices of 
established couple members are published in the opening stage. 

Fairness. In the commitment stage, no one can be in a more advantageous 
position than others, i.e., no one can have any partial information on others’ 
choices. 

Correctness. The correctness of the result of matchmaking should be publicly 
verifiable. 

Authenticity. The authenticity of commitment should be provided such that 
it can be verified that each registered participant has committed a single choice. 

Non-repudiation. The established couple members should not be able to 
repudiate their commitments later. 



3 Building Blocks 

In this section we describe some building blocks used in the proposed match- 
making protocol. 

3.1 Notation 

The basic cryptographic primitives used in our protocol are ElGamal public key 
encryption, digital signature, and zero-knowledge proof, which are commonly 
based on the discrete logarithm setting. Let $Z_p^*$ denote the multiplicative group 
modulo a prime p. We consider a cyclic subgroup of $Z_p^*$ of prime order q with 
$q \mid (p - 1)$. Let g be the published generator of the subgroup of order q. TTP has 
a certified public key $y = g^x$ and its corresponding private key x. In this paper 
we use the following notation. 

$Sig_i(m)$ is a digital signature of a participant i for a message m where the 
signature scheme is secure against existential forgery. 

$E_j(m)$ is a probabilistic ElGamal encryption of a message m with the 
participant j’s public key. 

$Env(i, j, m) = [c \| s] = [E_j(m) \| Sig_i(c)]$ is an enveloped message of a plaintext 
m sent by a sender i to a recipient j. The plaintext m is encrypted with the 
recipient j’s public key and the resulting ciphertext c is signed by the sender i. 

$H()$ is a collision resistant hash function. For security proof, it is considered as 
a random oracle. 



3.2 Public Commitment Scheme 

A commitment scheme is a digital implementation of a sealed box which is opened 
later. In a two-party computation, a commitment scheme is a two-phase protocol 
of commitment and reveal stages, and should satisfy the requirements of secrecy 
and unambiguity. The situation in the matchmaking protocol is a multiparty 
computation, so we define a new concept of public commitment scheme where a 
message is committed to the public, revealed with the help of TTP, and publicly 
verified. 

Definition 1 (Public commitment scheme). 

A public commitment scheme is a 3-stage protocol between multiple participants 
$P_1, \ldots, P_n$ and a TTP which consists of: 

1. Commitment: Multiple participants commit their secure messages to the pub- 
lic in a secure way such that only TTP can open it. 

2. Reveal: TTP opens the commitments and provides the proofs of results. 

3. Verification: Anyone verifies the correctness of the results. 

And it satisfies the following requirements: 

1. Secrecy: At the end of stage 1, no one except the sender can tell what 
value is being sent. 

2. Unambiguity: Given a commitment of stage 1, there is at most one value 
everybody may accept as valid. It means that the commitment is bound to a 
value. 

3. Non-malleability: Given a commitment of stage 1, no one except the sender 
can generate another legal commitment which has a message related to 
the original message. 

4. Non-repudiation: Once a participant has committed a message to the public in 
stage 1, he cannot repudiate it later. 

In this study we implement the public commitment scheme using the 
enveloped message defined above. In the commitment stage, a participant $P_i$ commits 
a message $m_i$ to the public as $Env(i, T, m_i)$, which is encrypted with TTP’s 
public key. In the reveal stage, TTP recovers the message and publishes it with a 
proof of the correctness of decryption. In the verification stage, anyone verifies the 
correctness of the message. 

Lemma 1. Let $P_1, \ldots, P_n$ be multiple participants and T be a TTP. A 
commitment scheme using an enveloped message 
$Env(i, T, m) = [c \| s] = [E_T(m) \| Sig_i(c)]$ is a public commitment scheme between $P_i$ and T. 

Proof. (sketch) The probabilistic public key encryption scheme provides secrecy 
and unambiguity assuming that T does not collude with participants. 
Considering the results of [TY98] and [SJ00], the enveloped message (a combination of 
a public key encryption and a digital signature) can be considered as a 
non-malleable encryption scheme. So non-malleability is satisfied. The usage of a 
digital signature provides non-repudiation. □ 
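
A possible shape of the enveloped message is sketched below in Python. Several choices here are assumptions rather than the paper's specification: the signature is a Schnorr-style signature (the paper only requires security against existential forgery), SHA-256 plays the role of the hash, the group parameters are toy values, and all function names are hypothetical.

```python
import hashlib
import secrets

P, Q, G = 1019, 509, 4   # toy parameters (assumption): G has prime order Q mod P


def hash_int(*values) -> int:
    """Hash the string form of the inputs to an integer."""
    data = "|".join(str(v) for v in values).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def keygen() -> tuple[int, int]:
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)


def encrypt(y: int, m: int) -> tuple[int, int]:
    k = secrets.randbelow(Q - 1) + 1
    return pow(G, k, P), (pow(y, k, P) * m) % P


def sign(x: int, message) -> tuple[int, int]:
    """Schnorr-style signature (e, s) with e = H(g^k, message), s = k + e*x."""
    k = secrets.randbelow(Q - 1) + 1
    e = hash_int(pow(G, k, P), message)
    return e, (k + e * x) % Q


def verify(y: int, message, signature: tuple[int, int]) -> bool:
    """Recompute r = g^s * y^(-e) and compare the hash with e."""
    e, s = signature
    r = (pow(G, s, P) * pow(pow(y, e % Q, P), -1, P)) % P
    return e == hash_int(r, message)


def envelope(sender_private: int, ttp_public: int, m: int):
    """Env(i, T, m) = [c || s]: encrypt for T, then sign the ciphertext c."""
    c = encrypt(ttp_public, m)
    return c, sign(sender_private, c)


ttp_private, ttp_public = keygen()
sender_private, sender_public = keygen()
c, s = envelope(sender_private, ttp_public, m=16)

valid = verify(sender_public, c, s)
tampered = verify(sender_public, (c[0], (c[1] * G) % P), s)
print(valid, tampered)
```

Because the signature covers the ciphertext, anyone can check who posted a commitment without being able to read it, and a modified ciphertext no longer verifies.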

3.3 Proving the Correctness of Decryption 

Firstly we describe the zero-knowledge proof of knowledge protocol for the 
equality of two discrete logarithms [CP93], [CGS97]. For universal verifiability, we 
consider its non-interactive version using the Fiat-Shamir heuristic [FS87] and 
a collision resistant hash function. The 3-move interactive protocol is 
honest-verifier zero-knowledge and the security of the non-interactive version is obtained 
under the random oracle model [CGS97]. 

Let $\alpha$ and $\beta$ be two independent elements of the cyclic subgroup of $Z_p^*$ of order q, 
i.e., nobody knows the relative discrete logarithm $\log_\alpha \beta$. 

Protocol 1. Proving the equality of two discrete logarithms 

The prover wants to prove that two elements $y = \alpha^x$ and $z = \beta^x$ have the 
same discrete logarithm for base $\alpha$ and $\beta$, respectively, without exposing x, i.e., 
he wants to prove that $\log_\alpha y = \log_\beta z$ holds. $(\alpha, y, \beta, z)$ are given as common 
input. 

Prover 

- Chooses $k \in_R Z_q^*$. 

- Computes $r_\alpha = \alpha^k$, $r_\beta = \beta^k$. 

- Computes $v = H(\alpha, y, \beta, z, r_\alpha, r_\beta)$. 

- Computes $s = k - vx$. 

- Publishes $(r_\alpha, r_\beta, s)$. 

Verifier (anyone) 

- Computes $v = H(\alpha, y, \beta, z, r_\alpha, r_\beta)$. 

- Verifies $r_\alpha = \alpha^s y^v$, $r_\beta = \beta^s z^v$. 
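
The prover and verifier steps of Protocol 1 can be sketched in Python as follows. This is an illustration under stated assumptions: the group parameters are toy values, SHA-256 stands in for H, the challenge is mapped into [1, q - 1] so that it is never zero in such a tiny group, and the function names are hypothetical. In the toy group the two bases are of course not independent in any meaningful sense.

```python
import hashlib
import secrets

# Toy parameters (assumption): P = 2Q + 1; ALPHA and BETA are squares
# modulo P, so both have prime order Q.
P, Q = 1019, 509
ALPHA, BETA = 4, 9


def challenge(*values: int) -> int:
    """Hash the transcript to a challenge v in [1, Q - 1]."""
    data = "|".join(str(v) for v in values).encode()
    return 1 + int.from_bytes(hashlib.sha256(data).digest(), "big") % (Q - 1)


def prove_equal(x: int, y: int, z: int) -> tuple[int, int, int]:
    """Prover: publish (r_alpha, r_beta, s) with s = k - v*x mod Q."""
    k = secrets.randbelow(Q - 1) + 1
    r_alpha, r_beta = pow(ALPHA, k, P), pow(BETA, k, P)
    v = challenge(ALPHA, y, BETA, z, r_alpha, r_beta)
    return r_alpha, r_beta, (k - v * x) % Q


def verify_equal(y: int, z: int, proof: tuple[int, int, int]) -> bool:
    """Verifier: check r_alpha = alpha^s y^v and r_beta = beta^s z^v."""
    r_alpha, r_beta, s = proof
    v = challenge(ALPHA, y, BETA, z, r_alpha, r_beta)
    ok_alpha = r_alpha == (pow(ALPHA, s, P) * pow(y, v, P)) % P
    ok_beta = r_beta == (pow(BETA, s, P) * pow(z, v, P)) % P
    return ok_alpha and ok_beta


x = 77
y, z = pow(ALPHA, x, P), pow(BETA, x, P)
good = verify_equal(y, z, prove_equal(x, y, z))

z_other = pow(BETA, x + 1, P)       # an element with a different logarithm
bad = verify_equal(y, z_other, prove_equal(x, y, z_other))
print(good, bad)
```

The first check succeeds because $\alpha^s y^v = \alpha^{k - vx} \alpha^{xv} = \alpha^k$; the second fails for `z_other` because the extra factor $\beta^v$ cannot vanish when v is nonzero modulo q.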

Now let the public key of a recipient R be $y = g^x$. When the recipient gets 
an ElGamal ciphertext $(a, b) = (g^k, y^k m)$, he can recover the message m by 
decrypting it with his private key x and prove its correctness by exposing x. If 
he wants to prove the correctness of decryption without exposing x, he can do 
it using the following protocol. 

Protocol 2. Proving the correctness of decryption 

Let the public key of a recipient R be $y = g^x$. He wants to prove that m is 
the plaintext of a ciphertext $(a, b) = (g^k, y^k m)$ without exposing the private key 
x. He can prove it by showing $\log_g y = \log_a (b/m)$ using Protocol 1. 

Lemma 2. Protocol 2 proves the correctness of decryption. 

Proof. (sketch) The public key $y = g^x$ of the recipient R is publicly known. So 
showing $\log_g y = \log_a (b/m)$ is equivalent to showing $b/m = a^x$, which proves the 
correctness of decryption. □ 
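
Protocol 2 amounts to running the equality proof with bases g and a on the elements y and b/m. A minimal Python sketch follows, assuming toy group parameters, SHA-256 as the hash, a challenge mapped into [1, q - 1], and hypothetical function names.

```python
import hashlib
import secrets

P, Q, G = 1019, 509, 4   # toy parameters (assumption): G has prime order Q mod P


def challenge(*values: int) -> int:
    data = "|".join(str(v) for v in values).encode()
    return 1 + int.from_bytes(hashlib.sha256(data).digest(), "big") % (Q - 1)


def prove_equal(alpha, beta, x, y, z):
    """Equality proof for log_alpha(y) = log_beta(z) = x."""
    k = secrets.randbelow(Q - 1) + 1
    r_alpha, r_beta = pow(alpha, k, P), pow(beta, k, P)
    v = challenge(alpha, y, beta, z, r_alpha, r_beta)
    return r_alpha, r_beta, (k - v * x) % Q


def verify_equal(alpha, beta, y, z, proof) -> bool:
    r_alpha, r_beta, s = proof
    v = challenge(alpha, y, beta, z, r_alpha, r_beta)
    return (r_alpha == (pow(alpha, s, P) * pow(y, v, P)) % P
            and r_beta == (pow(beta, s, P) * pow(z, v, P)) % P)


def decrypt_with_proof(x, y, ciphertext):
    """Recipient: decrypt and prove log_g(y) = log_a(b/m)."""
    a, b = ciphertext
    m = (b * pow(pow(a, x, P), -1, P)) % P
    z = (b * pow(m, -1, P)) % P            # b/m, which equals a^x
    return m, prove_equal(G, a, x, y, z)


def check_decryption(y, ciphertext, m, proof) -> bool:
    """Anyone: recompute b/m for the claimed plaintext and verify the proof."""
    a, b = ciphertext
    z = (b * pow(m, -1, P)) % P
    return verify_equal(G, a, y, z, proof)


x = 201
y = pow(G, x, P)
message = pow(G, 5, P)
ciphertext = (pow(G, 33, P), (pow(y, 33, P) * message) % P)

m, proof = decrypt_with_proof(x, y, ciphertext)
accepted = check_decryption(y, ciphertext, m, proof)
rejected = check_decryption(y, ciphertext, (m * G) % P, proof)
print(m == message, accepted, rejected)
```

The private key x never leaves `decrypt_with_proof`; a verifier only sees the claimed plaintext and the proof, and a wrong plaintext is rejected.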



3.4 Proving the Equality or Inequality of Two Discrete Logarithms 

Assume that the prover knows the discrete logarithm x of $y = \alpha^x$ for the base $\alpha$. 
He wants to allow the verifier to decide whether $\log_\alpha y = \log_\beta z$ or $\log_\alpha y \neq \log_\beta z$ 
for given group elements $\beta$ and z, such that no partial information about the 
logarithm x is leaked. 

Protocol 1 provides only the proof of equality of two discrete logarithms. 
If this proof fails, all the arguments of prover are invalid. [MS97] provides the 
proof of equality or inequality at the same time. Zero-knowledge interactive 
proof system convinces only the verifier and requires interaction, but in our 
situation of matchmaking the prover (host of matchmaking) wants to convince 
all the participants, so we use the non-interactive version of their scheme which 
provides universal verifiability. 

Protocol 3. Proving the equality or inequality of two discrete 
logarithms 

The prover wants to convince the public whether the two elements y and z 
have the same discrete logarithm or not for the base $\alpha$ and $\beta$, respectively, 
without exposing the discrete logarithm, i.e., he wants to prove that either 
$\log_\alpha y = \log_\beta z$ or $\log_\alpha y \neq \log_\beta z$ holds. $(\alpha, \beta, y, z)$ are given as common input. 

Prover (TTP) 

- Chooses $k, k' \in_R Z_q^*$. 

- Computes $r_\alpha = \alpha^k$, $r_\beta = \beta^k$, $r'_\alpha = \alpha^{k'}$ and $r'_\beta = \beta^{k'}$. 

- Computes $v = H(\alpha, y, \beta, z, r_\alpha, r_\beta, r'_\alpha, r'_\beta)$. 

- Computes $s = k - vx$ and $s' = k' - vk$. 

- Publishes $(r_\alpha, r_\beta, r'_\alpha, r'_\beta, s, s')$. 

Verifier (anyone) 

- Computes $v = H(\alpha, y, \beta, z, r_\alpha, r_\beta, r'_\alpha, r'_\beta)$. 

- Verifies $r_\alpha = \alpha^s y^v$, $r'_\alpha = \alpha^{s'} r_\alpha^v$, $r'_\beta = \beta^{s'} r_\beta^v$. If the verification fails, stop 
the verification process. 

- If $r_\beta = \beta^s z^v$, then $\log_\alpha y = \log_\beta z$. Else if $r_\beta \neq \beta^s z^v$, then $\log_\alpha y \neq \log_\beta z$. 
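
A Python sketch of Protocol 3 is given below. It is an illustration only, under the following assumptions: toy group parameters, SHA-256 as the hash H, a challenge mapped into [1, q - 1] so that it is never zero, and hypothetical function names. The verifier returns `None` when the proof is malformed, `True` for equality and `False` for inequality.

```python
import hashlib
import secrets
from typing import Optional

P, Q = 1019, 509          # toy parameters (assumption): P = 2Q + 1
ALPHA, BETA = 4, 9        # two elements of prime order Q modulo P


def challenge(*values: int) -> int:
    data = "|".join(str(v) for v in values).encode()
    return 1 + int.from_bytes(hashlib.sha256(data).digest(), "big") % (Q - 1)


def prove(alpha, beta, x, y, z):
    """Prover: knows x = log_alpha(y); z may or may not equal beta^x."""
    k = secrets.randbelow(Q - 1) + 1
    k2 = secrets.randbelow(Q - 1) + 1
    r_a, r_b = pow(alpha, k, P), pow(beta, k, P)
    r_a2, r_b2 = pow(alpha, k2, P), pow(beta, k2, P)
    v = challenge(alpha, y, beta, z, r_a, r_b, r_a2, r_b2)
    s = (k - v * x) % Q
    s2 = (k2 - v * k) % Q
    return r_a, r_b, r_a2, r_b2, s, s2


def decide(alpha, beta, y, z, proof) -> Optional[bool]:
    """Verifier: None if the proof is invalid, else equality (True/False)."""
    r_a, r_b, r_a2, r_b2, s, s2 = proof
    v = challenge(alpha, y, beta, z, r_a, r_b, r_a2, r_b2)
    well_formed = (
        r_a == (pow(alpha, s, P) * pow(y, v, P)) % P
        and r_a2 == (pow(alpha, s2, P) * pow(r_a, v, P)) % P
        and r_b2 == (pow(beta, s2, P) * pow(r_b, v, P)) % P
    )
    if not well_formed:
        return None
    return r_b == (pow(beta, s, P) * pow(z, v, P)) % P


x = 77
y = pow(ALPHA, x, P)
z_same = pow(BETA, x, P)
z_diff = pow(BETA, x + 5, P)

same = decide(ALPHA, BETA, y, z_same, prove(ALPHA, BETA, x, y, z_same))
diff = decide(ALPHA, BETA, y, z_diff, prove(ALPHA, BETA, x, y, z_diff))

proof = prove(ALPHA, BETA, x, y, z_same)
forged = proof[:4] + ((proof[4] + 1) % Q, proof[5])
invalid = decide(ALPHA, BETA, y, z_same, forged)
print(same, diff, invalid)
```

The pair $(s', r'_\alpha, r'_\beta)$ shows that $r_\alpha$ and $r_\beta$ share the same exponent k, so the final comparison $r_\beta = \beta^s z^v$ holds exactly when $z = \beta^x$.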



3.5 Finding Collisions in ElGamal Ciphertexts 

When TTP has a public key $y = g^x$ and the corresponding private key x, 
the ElGamal encryption of a message m with TTP’s public key y is given by 
$(a, b) = (g^k, y^k m)$ where $k \in_R Z_q^*$. Now assume that TTP has received two 
ciphertexts $(a_1, b_1) = (g^{k_1}, y^{k_1} m_1)$ and $(a_2, b_2) = (g^{k_2}, y^{k_2} m_2)$. Of course TTP 
can recover the two messages $m_1$ and $m_2$ using his private key x and publish them, 
but TTP wants to convince the public whether the two ciphertexts have the 
same message or not, without exposing any partial information on the messages 
or his private key x. This task can be accomplished as follows using Protocol 3. 

Protocol 4. Finding collisions in ElGamal ciphertexts 

Let $y = g^x$ be TTP’s public key and x be his private key. Given two ElGamal 
ciphertexts $(a_1, b_1) = (g^{k_1}, y^{k_1} m_1)$ and $(a_2, b_2) = (g^{k_2}, y^{k_2} m_2)$, TTP wants 
to prove whether the two ciphertexts have the same message or not, without 
exposing any partial information on the messages or his private key x. 

Prover (TTP) 

- Computes $(a_3, b_3) = (a_1/a_2, b_1/b_2) = (g^{k_1 - k_2}, y^{k_1 - k_2} m_1/m_2)$. 

- Proves that $\log_g y = \log_{a_3} b_3$ or $\log_g y \neq \log_{a_3} b_3$ using Protocol 3. 

Verifier (anyone) 

- If $\log_g y = \log_{a_3} b_3$, then $m_1 = m_2$. Else if $\log_g y \neq \log_{a_3} b_3$, then $m_1 \neq m_2$. 



Lemma 3. Protocol 4 proves whether two ciphertexts $(a_1, b_1)$ and $(a_2, b_2)$ have 
the same plaintext or not. 

Proof. (sketch) TTP’s public key $y = g^x$ is publicly known. So proving 
$\log_g y \overset{?}{=} \log_{a_3} b_3$ is equivalent to proving $a_3^x \overset{?}{=} b_3$, which proves $m_1 \overset{?}{=} m_2$. □ 

Protocol 4 can be used in a wide range of applications where just the proof of 
equality or inequality of plaintexts is required while the plaintexts should not be 
recovered. 
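
The collision test of Protocol 4 can be sketched in Python by dividing the two ciphertexts and running the equality-or-inequality proof on the quotient. The sketch assumes toy group parameters, SHA-256 as the hash, a challenge mapped into [1, q - 1], and hypothetical function names; it also assumes the two ciphertexts use different random values k, since otherwise the quotient $a_3$ equals 1.

```python
import hashlib
import secrets
from typing import Optional

P, Q, G = 1019, 509, 4   # toy parameters (assumption): G has prime order Q mod P


def challenge(*values: int) -> int:
    data = "|".join(str(v) for v in values).encode()
    return 1 + int.from_bytes(hashlib.sha256(data).digest(), "big") % (Q - 1)


def encrypt(y: int, m: int, k: int) -> tuple[int, int]:
    return pow(G, k, P), (pow(y, k, P) * m) % P


def prove(alpha, beta, x, y, z):
    """Equality-or-inequality proof for log_alpha(y) versus log_beta(z)."""
    k = secrets.randbelow(Q - 1) + 1
    k2 = secrets.randbelow(Q - 1) + 1
    r_a, r_b = pow(alpha, k, P), pow(beta, k, P)
    r_a2, r_b2 = pow(alpha, k2, P), pow(beta, k2, P)
    v = challenge(alpha, y, beta, z, r_a, r_b, r_a2, r_b2)
    return r_a, r_b, r_a2, r_b2, (k - v * x) % Q, (k2 - v * k) % Q


def decide(alpha, beta, y, z, proof) -> Optional[bool]:
    r_a, r_b, r_a2, r_b2, s, s2 = proof
    v = challenge(alpha, y, beta, z, r_a, r_b, r_a2, r_b2)
    if not (r_a == (pow(alpha, s, P) * pow(y, v, P)) % P
            and r_a2 == (pow(alpha, s2, P) * pow(r_a, v, P)) % P
            and r_b2 == (pow(beta, s2, P) * pow(r_b, v, P)) % P):
        return None
    return r_b == (pow(beta, s, P) * pow(z, v, P)) % P


def quotient(c1, c2) -> tuple[int, int]:
    """(a3, b3) = (a1/a2, b1/b2)."""
    return ((c1[0] * pow(c2[0], -1, P)) % P, (c1[1] * pow(c2[1], -1, P)) % P)


def ttp_compare(x, y, c1, c2):
    """TTP: prove whether c1 and c2 hide the same plaintext."""
    a3, b3 = quotient(c1, c2)
    return prove(G, a3, x, y, b3)


def public_check(y, c1, c2, proof) -> Optional[bool]:
    """Anyone: True if the plaintexts collide, False if not, None if invalid."""
    a3, b3 = quotient(c1, c2)
    return decide(G, a3, y, b3, proof)


x = 321
y = pow(G, x, P)
m1, m2 = pow(G, 10, P), pow(G, 20, P)
c1, c2, c3 = encrypt(y, m1, 5), encrypt(y, m1, 9), encrypt(y, m2, 13)

collide = public_check(y, c1, c2, ttp_compare(x, y, c1, c2))
differ = public_check(y, c1, c3, ttp_compare(x, y, c1, c3))
print(collide, differ)
```

Neither plaintext appears in the proof or in the check; the verifier only learns whether the two ciphertexts carry the same message.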



4 Proposed Matchmaking Protocol 

In this section we describe our matchmaking protocol which is constructed using 
the primitives described above. The participants are m male members 
$M_i$ ($i = 1, \ldots, m$), n female members $F_j$ ($j = 1, \ldots, n$) and a host T. They use the bulletin 
board as a public communication channel. 

The proposed matchmaking protocol consists of 5 stages: registration, 
commitment, opening, proof of coupling, and verification. In the sequel we 
will describe the case that $M_i$ and $F_j$ have chosen each other for the ease of 
description. 

Stage 1. Registration. 

Any applicant who wants to participate in matchmaking should register with 
T. Each participant does the following steps: 

- $M_i$ chooses a random number $a_i \in_R Z_q^*$ and computes his temporal ID 
$m_i = g^{a_i}$. Likewise $F_j$ chooses a random number $b_j \in_R Z_q^*$ and computes 
her temporal ID $f_j = g^{b_j}$. 

- $M_i$ presents to T his name, his public key with certificate, and his temporal 
ID. Likewise $F_j$ presents to T her name, her public key with certificate, and 
her temporal ID. 

- T publishes this information on the bulletin board. 

Stage 2. Commitment (by participants). 

In the commitment stage each participant chooses a member from the other 
group as a possible partner, generates a couple ID for the choice, generates a 
public commitment by encrypting the couple ID with TTP’s public key, and 
posts it on the bulletin board. Note that each participant is allowed to commit 
a single choice. 

To generate the couple ID, we use the Diffie-Hellman key agreement 
technique. A couple ID between $M_i$ and $F_j$ is given by 
$CID(M_i, F_j) = H(m_i^{b_j}) = H(f_j^{a_i}) = H(g^{a_i b_j})$ and it can be computed only by $M_i$ and $F_j$. 

Each participant does the following steps: 

- Chooses a possible partner from the other group. 

- Computes a couple ID for the choice. If $M_i$ has chosen $F_j$, he generates his 
couple ID with his random number $a_i$ and the partner’s published temporal 
ID $f_j$ such that $CID(M_i, F_j) = H(f_j^{a_i})$. 

- Generates a public commitment for the choice and posts it on the bulletin 
board. $M_i$’s public commitment is given by $Env(i, T, CID(M_i, F_j))$. 
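
The couple ID computation is a Diffie-Hellman exchange followed by a hash, which the following Python sketch illustrates. The group parameters are toy values, SHA-256 stands in for H, and the function names and secret values are hypothetical.

```python
import hashlib

P, Q, G = 1019, 509, 4   # toy parameters (assumption): G has prime order Q mod P


def H(value: int) -> str:
    """Hash a group element (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(str(value).encode()).hexdigest()


def temporal_id(secret: int) -> int:
    """Temporal ID g^secret published at registration."""
    return pow(G, secret, P)


def couple_id(own_secret: int, partner_temporal_id: int) -> str:
    """CID = H(partner_id ^ own_secret) = H(g^(a_i * b_j))."""
    return H(pow(partner_temporal_id, own_secret, P))


a_i, b_j, b_other = 101, 202, 303          # secret random numbers
m_i, f_j, f_other = temporal_id(a_i), temporal_id(b_j), temporal_id(b_other)

cid_by_man = couple_id(a_i, f_j)           # M_i chooses F_j
cid_by_woman = couple_id(b_j, m_i)         # F_j chooses M_i
cid_elsewhere = couple_id(a_i, f_other)    # M_i choosing someone else
print(cid_by_man == cid_by_woman, cid_by_man == cid_elsewhere)
```

Mutual choices therefore yield identical plaintexts inside the two commitments, which is what makes a collision test on the ciphertexts sufficient to detect a couple.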

Stage 3. Opening (by TTP). 

When the deadline of commitment has passed, TTP tries to find couples using 
Protocol 4 for every possible pair of participants $(M_i, F_j)$ where $i = 1, \ldots, m$ and 
$j = 1, \ldots, n$. TTP posts the results and their proofs on the bulletin board. 

For the established couple members, TTP additionally decrypts their 
commitments and publishes the couple IDs, which demonstrates that the two 
commitments are equal. TTP proves the correctness of his decryption using Protocol 2. 

TTP does the following steps: 

- TTP tries to find couples for every possible pair $(M_i, F_j)$ using Protocol 4 
and posts all the results and proofs on the bulletin board. 

- For the established couple members, TTP decrypts their public commitments 
to get the colliding couple ID and publishes it on the bulletin board. He 
provides proofs of correctness of the decryptions using Protocol 2. 

Stage 4. Proof of coupling (by established couple members). 

The participants who were published as a couple in the opening stage have 
to show that the colliding couple ID is authentic. A simple way to prove the 
authenticity of the couple ID is showing the random number corresponding to 
the participant’s temporal ID. $CID(M_i, F_j)$ can be proven authentic by verifying 
$CID(M_i, F_j) = H(f_j^{a_i})$ if $M_i$ opens $a_i$ or by verifying $CID(M_i, F_j) = H(m_i^{b_j})$ 
if $F_j$ opens $b_j$. 

If the established couple members do not want to open their secret random 
numbers, the authenticity of $CID(M_i, F_j)$ can be proven using Protocol 2. $M_i$ 
can prove $\log_g m_i = \log_{f_j}(CID(M_i, F_j))$ without exposing $a_i$. 

One of the couple members cannot repudiate his or her commitment while 
the other member does not want to repudiate, because both of them can prove 
the authenticity of $CID(M_i, F_j)$. If both of them refuse to prove it (having changed 
their minds together), they will not be decided as a couple. Based on their proofs 
of coupling, everyone can verify the correctness of coupling. 

Stage 5. Verification (by anyone). 

Anyone can verify the correctness of the results by: 

- Verifying the correctness of TTP’s opening in stage 3. 

- Verifying the authenticity of the colliding couple ID published in stage 4. 

- Verifying the couple members’ signatures on the public commitments posted in 
stage 2. 

5 Security Analysis 

Our proposed matchmaking protocol satisfies all the listed security requirements. 

Theorem 1. The proposed matchmaking protocol is secure in the sense that it 
satisfies secrecy, fairness, correctness, authenticity, and non-repudiation. 

Proof, (sketch) 

Secrecy (privacy). All the commitments are encrypted with TTP’s public key. 
So if TTP does not help, no participant can get any partial 
information on the choices of others. Although TTP can decrypt a commitment, he 
cannot get any information on the choice without the help of the specific couple 
members. Under the computational Diffie-Hellman assumption, identifying 
the couple $(M_i, F_j)$ from $H(g^{a_i b_j})$ is computationally infeasible. Only the 
choices of established couple members are published by TTP in the opening 
stage. 

Fairness. If TTP does not help, no one can be in a more advantageous position 
than others in the commitment stage. If TTP helps a participant by secretly 
opening the other group members’ commitments, he can check whether there is 
any choice for him by trying all the possible pairs. 

Correctness. Correctness of the result is guaranteed because all the procedures 
are publicly verifiable. In the opening stage, TTP tries to find couples for every 
possible pair of participants and provides the proofs of the result using 
Protocol 4, whose correctness is given by Lemma 3. For the established couple 
members, TTP additionally decrypts their commitments and publishes the 
colliding couple ID. TTP proves the correctness of his decryption using 
Protocol 2, whose correctness is given by Lemma 2. In the proof of coupling stage, 
the established couple members prove the authenticity of the colliding couple 
ID by showing their random number. 

Authenticity. In the commitment stage, the public commitment is signed by the 
sender. It guarantees the authenticity of the commitment, and the fact that each 
registered participant has committed a single choice can be verified. 

Non-repudiation. In the commitment stage, the committed message is signed 
by the sender. So the established couple members cannot repudiate their 
commitments later. In the proof of coupling stage, one of the couple members 
cannot repudiate the commitment while the other member does not want to 
repudiate, because both of them can prove the authenticity of the colliding 
couple ID. 

□ 

We can consider a situation when TTP is not honest. If TTP decrypts the 
public commitments and exposes them intentionally, each pair of participants 
can check whether the other participant has chosen him or her because both 
of them can generate the corresponding couple ID. Therefore TTP should not 
decrypt the commitments and expose them. 

Our proposed matchmaking scheme requires $O(mn)$ computation in the 
opening stage because TTP has to try to find couples for every possible pair of 
participants. Enhancing the efficiency of the protocol is left for further study. 
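
The quadratic cost of the opening stage can be seen from the loop structure alone. In the Python sketch below, `collide` is a stand-in for one run of the collision test on a pair of commitments; here it is replaced by plain equality on hypothetical labels, so the example illustrates only the number of tests and none of the cryptography.

```python
from itertools import product
from operator import eq


def open_all_pairs(male_commitments: dict, female_commitments: dict, collide):
    """Run one collision test per (male, female) pair and count the tests."""
    couples, tests = [], 0
    for (i, cm), (j, cf) in product(male_commitments.items(),
                                    female_commitments.items()):
        tests += 1
        if collide(cm, cf):
            couples.append((i, j))
    return couples, tests


# Hypothetical stand-in data: equal labels model colliding couple IDs.
male = {"M1": "id-A", "M2": "id-B", "M3": "id-C"}
female = {"F1": "id-C", "F2": "id-X"}

couples, tests = open_all_pairs(male, female, eq)
print(couples, tests)  # [('M3', 'F1')] 6
```

With m = 3 and n = 2 the loop performs 3 * 2 = 6 tests, each of which would be a full proof in the actual protocol.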



6 Conclusion 

We introduce a secure matchmaking protocol as a new application of secure 
multiparty computation and construct a simple and efficient protocol under the 
simple rule that two participants become a couple only when they have chosen 
each other. To implement secure matchmaking, we use various primitives of 
zero-knowledge proofs: proving the correctness of decryption, proving the equality or 
inequality of two discrete logarithms, and finding collisions in ElGamal 
ciphertexts. We also define the concept of public commitment scheme and show that 
an enveloped message (a combination of public key encryption and digital signature) 
can be used for public commitment. Our basic approach is that participants 
commit their choices secretly by encrypting them with TTP’s public key and TTP 
tries to find collisions in ElGamal ciphertexts without decryption. 

The main security issue is the honesty of TTP, which was assumed in this 
study. If TTP colludes with a participant, he can get partial information on other 
participants’ choices. The computational load of TTP in the opening stage is $O(mn)$ 
because he has to try to find couples for every possible pair of participants. In 
this sense, further study to enhance the efficiency of the protocol remains a 
challenging problem. 

To the best of our knowledge, this is the first attempt to apply 
cryptographic primitives to the problem of matchmaking. We believe that our result 
can play a significant role in narrowing the gap between cryptographic theory 
and real multiparty applications. 

Abstract. Digital signatures are important applications of public key 
cryptography in today’s digital networks. However, they have the problem 
that anyone can verify a signature even if the signer wants to restrict 
who can verify his signatures. D. Chaum et al. [1] proposed 
undeniable signatures to solve this problem. These signatures are based on the 
discrete logarithm problem and have been extended to variants with different 
properties [2-4]. After that, R. Gennaro et al. [5] proposed another undeniable 
signature scheme based on RSA. However, this scheme has the following 
problems. Firstly, their undeniable signatures cannot be converted 
into usual signatures individually. So if a user wants to use both 
undeniable signatures and usual signatures, he must prepare separate 
parameters for each type of signature. Secondly, the denial protocol is 
not deterministic because it uses a zero knowledge interactive proof, so 
it is not efficient. Thirdly, their signature system cannot resist the hidden 
verifier attack [7]. In this paper we propose an improved scheme to 
solve these problems. 



1 Introduction 

Nowadays the use of public key cryptosystems is important not only for 
encryption but also for the realization of a digital signature system. A digital 
signature provides authentication of digital contents, i.e., anybody can 
verify whether the signer is the owner of these digital contents by checking their 
signature. 

A usual digital signature allows anybody to verify the signature. This is 
not desirable when a signer wants to restrict the verifiers. 

Undeniable signatures, proposed by D. Chaum and H. V. Antwerpen [1], are 
useful for a signer who wants to restrict the verifiers, since a signature cannot be verified without 
the signer’s cooperation. 

The signature scheme has both a confirmation protocol and a denial protocol, 
so that a signer cannot refuse a request for verifying his signature. The 
signer uses the confirmation protocol to verify a valid signature and the denial 
protocol to prove the falseness of an invalid signature. 

Chaum’s undeniable signature scheme is based on the discrete logarithm 
problem. Variations of it [2-4] also exist with different additional properties. 

D. Won (Ed.): ICISC 2000, LNCS 2015, pp. 135-149, 2001. 

Springer-Verlag Berlin Heidelberg 2001
After that, R. Gennaro, H. Krawczyk and T. Rabin [5] proposed another
undeniable signature scheme that is based on RSA, i.e., the factoring
problem. The scheme can be used only for undeniable signatures, since the
key that serves as the public encryption key in the usual RSA system is
used as a private key in the scheme. An undeniable signature can be
converted into a usual RSA signature. However, this type of conversion
turns all previous undeniable signatures into usual RSA signatures, which
anyone can verify. This is one problem of the Gennaro-Krawczyk-Rabin
undeniable signature. Another problem is that it suffers from the hidden
verifier attack [6, 7]. In addition, the denial protocol is not
deterministic and incurs large computation and communication costs, since
it is based on a zero-knowledge interactive proof (ZKIP).

In this paper we propose an improvement of the Gennaro-Krawczyk-Rabin
scheme that solves these problems. First, we extend the RSA key pair so
that it consists of a public key and two private keys. When a signer wants
to generate a usual RSA signature, he uses the product of the two private
keys as the private key of the RSA system. On the other hand, when he
wants to generate our undeniable signature, he uses the product of the
public key and one of the private keys as the signing key. We replace the
probabilistic denial protocol of the Gennaro-Krawczyk-Rabin scheme with a
deterministic one based on Chaum’s scheme. In addition, we improve the
commitment protocol in order to resist the hidden verifier attack.

The paper is organized as follows. Some definitions and assumptions are
given in Section 2. The protocols of the undeniable signature scheme
proposed by Gennaro et al. are given in Section 3. Some problems of their
scheme are given in Section 4. Our new undeniable signature scheme is
given in Section 5.



2 Some Definitions and Assumptions

In this section, we summarize some definitions and assumptions about the
security of cryptosystems. They are necessary to guarantee the security of
the protocols in this paper.



2.1 Factoring Problem

In this paper, we define “large primes p, q” as primes satisfying the
following assumptions.

1. No Efficient Factoring Scheme

Anyone who has no information about either p or q cannot efficiently
calculate p and q from n = pq.

2. Impossibility of Exhaustive Factoring

No one has the computing power to factor n by exhaustive search.
2.2 Security Assumption of RSA

Let p, q be large primes. Let n = pq with $p \neq q$, and let
$L = \mathrm{LCM}(p - 1, q - 1)$. Here we assume the following limit on
computing power.

Let e be a number satisfying $\gcd(e, L) = 1$. Then anyone who knows
only n and e cannot calculate $d = e^{-1} \bmod L$ without information
about L.

2.3 Proper Public Information 

In this paper, we assume that anyone can publish any type of public
information properly, i.e., he can provide his public information to
others without alteration or omission. This assumption is useful for
realizing PKI (Public Key Infrastructure) systems [8].

2.4 Proof of the Validity of n 

We define p, q, p', q' as distinct large primes satisfying

    $p = 2p' + 1, \quad q = 2q' + 1.$    (1)

In this paper, we assume that anyone who publishes only n (= pq) can prove
that n is the product of large primes p, q and that p, q satisfy equation
(1), without publishing any information about p and q. This assumption is
useful for realizing the zero-knowledge proof system [9].

3 Gennaro-Krawczyk-Rabin’s Undeniable Signatures

In this section, we summarize the undeniable signature scheme proposed by
R. Gennaro, H. Krawczyk and T. Rabin [5].

3.1 System and Key Generation 

Gennaro et al. proposed an undeniable signature scheme based on RSA. In
this scheme, the keys of each user are generated by a system that is
trusted by the user.

The system first generates the set $\mathcal{N}$ as

    $\mathcal{N} = \{\, n \mid n = pq,\ p < q,\ p = 2p' + 1,\ q = 2q' + 1 \ (p, q, p', q' \text{: large primes}) \,\}$

and prepares the following parameters for each of its users.

1. Choose $n \in \mathcal{N}$.

2. Choose e, d so that $ed \equiv 1 \pmod{\phi(n)}$.

3. Choose $(w, S_w)$ with $w \in Z_n^*$, $w \neq 1$, $S_w = w^{d} \pmod{n}$.

Then it opens $(n, w, S_w)$ to the public as public keys and gives (e, d)
only to the user as private keys.
    P : Prover, V : Verifier

    V:       1. Choose i, j at random where 1 < i, j < n and calculate
                  $Q = S_m^{i} S_w^{j} \pmod{n}$.
    V -> P:  Q
    P:       2. Calculate A from Q:
                  $A = Q^{e} \pmod{n}$.
    P -> V:  A
    V:       3. Check $A = m^{i} w^{j} \pmod{n}$.
                If equality holds, then V accepts $S_m$ as P’s signature
                for m.

Fig. 1. Gennaro-Krawczyk-Rabin’s Signature Confirmation Protocol



3.2 Generation of Signatures 

The process of generating signatures in this scheme is similar to that of
the usual RSA signature. A signer prepares a message m, which is the hash
value of a plaintext. Then he calculates the signature $S_m$ as

    $S_m = m^{d} \pmod{n}.$    (2)
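
The key generation of Section 3.1 and equation (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the primes are toy-sized, and the concrete values of e, w and m, as well as the function name `sign`, are hypothetical and not taken from the paper.

```python
# Toy parameters only: real moduli are thousands of bits long.
p, q = 23, 47                 # safe primes: 23 = 2*11 + 1, 47 = 2*23 + 1
n = p * q
phi = (p - 1) * (q - 1)

e = 5                         # kept secret by the signer in this scheme
d = pow(e, -1, phi)           # e*d = 1 (mod phi(n)); needs Python 3.8+

w = 2                         # public sample value, w != 1
S_w = pow(w, d, n)            # published together with n and w


def sign(m: int) -> int:
    """Return S_m = m^d mod n for a message digest m (equation (2))."""
    return pow(m, d, n)


m = 123                       # stands in for the hash value of a plaintext
S_m = sign(m)
```

Only the holder of e can undo the exponent d here, which is why verification needs the signer’s cooperation.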



3.3 Signature Confirmation 

The confirmation of a signature needs the signer’s cooperation. This
property allows the signer to restrict the verification of his signature.
The receiver must therefore execute a signature confirmation protocol with
the signer to confirm the validity of a received signature.

We denote the signer of a signature by P (Prover) and a verifier of the
signature by V (Verifier). Assume that V has a pair $(m, S_m)$, where P
asserts that $S_m$ is his signature for the message m. Figure 1 shows
Gennaro-Krawczyk-Rabin’s signature confirmation protocol.

If $S_m$ is a valid signature for m, i.e., $S_m = m^{d} \pmod{n}$, then
the value A calculated by P becomes

    $A = Q^{e} = (S_m^{i} S_w^{j})^{e} = m^{i} w^{j} \pmod{n},$

so the equation in phase 3 of Figure 1 holds.
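
A single-process walk through the three phases of Figure 1 may make the derivation concrete. The sketch below assumes the same toy key as before; the function name `confirm` and the optional fixed challenges are hypothetical conveniences, and in a real run P and V would be separate parties.

```python
import secrets

p, q = 23, 47                         # toy safe primes
n = p * q
e = 5                                 # signer's secret exponent
d = pow(e, -1, (p - 1) * (q - 1))
w = 2
S_w = pow(w, d, n)


def confirm(m, S, i=None, j=None):
    """Run phases 1-3 of Figure 1; True means V accepts S as a signature on m."""
    # Phase 1 (V): choose the challenge exponents and blind the signature.
    i = secrets.randbelow(n - 2) + 2 if i is None else i   # 2 <= i < n
    j = secrets.randbelow(n - 2) + 2 if j is None else j
    Q = pow(S, i, n) * pow(S_w, j, n) % n
    # Phase 2 (P): answer using the secret exponent e.
    A = pow(Q, e, n)
    # Phase 3 (V): compare with the value V can compute on its own.
    return A == pow(m, i, n) * pow(w, j, n) % n


m = 123
S_m = pow(m, d, n)
accepted = confirm(m, S_m)
```

With a forged value in place of `S_m`, the check in phase 3 fails unless the forgery happens to cancel out for the chosen i.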
3.4 Denial of Signature

Here we present the denial protocol of Gennaro-Krawczyk-Rabin’s scheme.
This protocol is different from their signature confirmation protocol and
is executed when a signer wants to prove that a signature shown by the
verifier is false. We define a false signature $\hat{S}_m$ of a message m
by $\hat{S}_m \neq m^{d} \pmod{n}$. This protocol shows that $\hat{S}_m$
is not the valid signature for m. So if a signer cannot execute this
protocol, he cannot prove that $\hat{S}_m$ is a false signature for the
message m.

Assume that V has $(m, \hat{S}_m)$, where $\hat{S}_m$ is claimed to be P’s
signature on m but P wants to show that $\hat{S}_m$ is not a valid
signature for m. Figure 2 shows Gennaro-Krawczyk-Rabin’s denial protocol.



    P : Prover, V : Verifier

    V:       1. Choose b at random where $1 \le b \le k$.
                Choose j at random where 1 < j < n.
                Calculate i, $Q_1$ and $Q_2$ as follows:
                  $i = 4b$
                  $Q_1 = m^{i} w^{j} \pmod{n}$
                  $Q_2 = \hat{S}_m^{i} S_w^{j} \pmod{n}$
    V -> P:  $Q_1, Q_2$
    P:       2. Search for i such that
                  $Q_1 Q_2^{-e} = (m \hat{S}_m^{-e})^{i} \pmod{n}$.
             3. If no such i exists, then stop. Else set A = i.
    P -> V:  A
    V:       4. Check A = i.
                If equality holds, then V rejects $\hat{S}_m$ as a
                signature of m. Otherwise, undetermined.

Fig. 2. Gennaro-Krawczyk-Rabin’s Denial Protocol



The size of k is at most $1024 = 2^{10}$ because P determines i by
exhaustive search in phase 2.

If P is dishonest and the signature $S_m$ is valid, then the equation in
phase 2 becomes

    $Q_1 Q_2^{-e} = 1 \pmod{n}$

and is useless for finding i. The probability that P succeeds in guessing
the correct i in phase 3 is 1/k. With $k = 2^{10}$, the probability of P
cheating V is at most $1/2^{100}$ when V executes this protocol at least
ten times.

On the other hand, even if V does not send correctly formed values
$Q_1, Q_2$, he cannot get any information from P because P can stop the
protocol at phase 3.
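
The exhaustive search in phase 2 is easier to follow in code. The sketch below is a single-process illustration under stated assumptions: toy primes, a small bound k = 10 so that the search stays unambiguous in such a small group, and the challenge forms $Q_1 = m^{i} w^{j}$, $Q_2 = \hat{S}_m^{i} S_w^{j}$ as reconstructed in Figure 2. The function name `deny` is hypothetical.

```python
p, q = 23, 47                         # toy safe primes
n = p * q
e = 5
d = pow(e, -1, (p - 1) * (q - 1))
w = 2
S_w = pow(w, d, n)


def deny(m, S_hat, b, j, k=10):
    """Return True when V rejects S_hat as a signature on m."""
    # Phase 1 (V): hide i = 4b inside the two challenges.
    i = 4 * b
    Q1 = pow(m, i, n) * pow(w, j, n) % n
    Q2 = pow(S_hat, i, n) * pow(S_w, j, n) % n
    # Phase 2 (P): Q1 * Q2^(-e) equals (m * S_hat^(-e))^i, so search for i.
    target = Q1 * pow(Q2, -e, n) % n
    base = m * pow(S_hat, -e, n) % n
    candidates = range(4, 4 * k + 1, 4)
    A = next((c for c in candidates if pow(base, c, n) == target), None)
    # Phases 3-4: P stops if nothing was found; V compares A with i.
    return A is not None and A == i


m = 123
S_m = pow(m, d, n)                    # a valid signature
forged = 2 * S_m % n                  # an invalid one
```

For the valid signature the base collapses to 1, so every candidate matches and P can only guess i, which is the 1/k cheating probability mentioned above.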

4 Some Problems of Gennaro-Krawczyk-Rabin’s Scheme 

In this section we point out some problems in Gennaro-Krawczyk-Rabin’s scheme. 

1. Convertibility 

The way to convert an undeniable signature into a usual one is to open the
private key e to the public. This approach turns all of the signer’s
previous and future undeniable signatures into usual signatures.

In contrast, the undeniable signature schemes based on the discrete
logarithm problem also allow each undeniable signature to be converted
into a usual one individually.

2. Resistance to the Hidden Verifier Attack

Y. Desmedt and M. Yung pointed out the existence of the hidden verifier in
undeniable signatures [6]. We define the entity that is not allowed to
verify the signer P’s undeniable signature as the hidden verifier H, and
the entity that is allowed to verify that signature as the verifier V. H
can verify P’s signature by threatening V into verifying it on H’s behalf,
although P does not accept verification requests directly from H.

M. Jakobsson, K. Sako and R. Impagliazzo proposed a protocol that can
resist hidden verifier attacks [7]. However, Gennaro-Krawczyk-Rabin’s
scheme does not include this protocol, so we adapt their protocol to this
scheme.

3. Deterministic Denial Protocol

The denial protocol of their scheme is based on ZKIP, so P and V must
execute the denial protocol many times. This incurs large computation and
communication costs. However, an undeniable signature scheme based on the
discrete logarithm problem with a deterministic denial protocol exists
[10].

5 New Scheme 

In this section we propose a new undeniable signature scheme based on the
factoring problem.

5.1 Key Generating System

In Gennaro-Krawczyk-Rabin’s scheme, key generation is executed by the
system. This means that each user can use the scheme without performing
key generation himself. It also means that the system holds the private
keys that it has assigned to its users. So all of the users’ private
information is leaked if the secret information of the system is leaked.
The system therefore requires high security against attacks.

We consider another way to generate keys. In our proposed scheme, key
generation is realized by programs executed by each user. This means that
each user generates his keys by himself.

The system first generates the same set $\mathcal{N}$ as in
Gennaro-Krawczyk-Rabin’s scheme,

    $\mathcal{N} = \{\, n \mid n = pq,\ p < q,\ p = 2p' + 1,\ q = 2q' + 1 \ (p, q, p', q' \text{: large primes}) \,\}$

and each user generates the following parameters.

1. Choose $n \in \mathcal{N}$.

2. Calculate $L = \mathrm{LCM}(p - 1, q - 1) = 2p'q'$.

3. Choose two odd numbers $e, d_2$ where $3 \le e, d_2 \le L - 1$.

4. Calculate $d_1$ as $d_1 = (e d_2)^{-1} \pmod{L}$.

5. Calculate d as $d = d_1 d_2 \pmod{L}$.

6. Choose $(w, S_w)$ with $w \in Z_n^*$, $w \neq 1$, $S_w = w^{e d_1} \pmod{n}$.

Then the user opens $(e, n, w, S_w)$ to the public as public keys and
keeps $(d_1, d_2, d)$ secret as private keys.

The keys $e, d_1, d_2$ and d satisfy

    $M^{e d_1 d_2} = M^{e d} = M \pmod{n},$    (3)

i.e., the relation between the public key e and the private key d is the
same as that of a usual RSA key pair, and the private key d is the product
of the two keys $d_1, d_2$.

The cost of generating these keys is negligible.
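
As a rough illustration of steps 1 to 6, the following Python sketch derives a key set from toy primes and lets relation (3) be checked exhaustively. The function name `keygen`, the dictionary layout and the concrete values are assumptions made for this example. The sketch also assumes that $S_w$ carries the same exponent $e d_1$ as an undeniable signature, which is what the derivation in Section 5.4 uses. The explicit gcd test is an addition: being odd does not by itself guarantee that $e d_2$ is invertible modulo L.

```python
from math import gcd


def keygen(p1, q1, e, d2, w):
    """Derive the keys of Section 5.1 from primes p', q' (toy sizes only)."""
    p, q = 2 * p1 + 1, 2 * q1 + 1      # p and q must themselves be prime
    n, L = p * q, 2 * p1 * q1          # L = LCM(p - 1, q - 1) = 2 p' q'
    # Added check: e*d2 must be coprime to L for step 4 to be possible.
    if gcd(e * d2, L) != 1:
        raise ValueError("e and d2 must be coprime to L")
    d1 = pow(e * d2, -1, L)            # step 4 (Python 3.8+)
    d = d1 * d2 % L                    # step 5, hence e*d = 1 (mod L)
    S_w = pow(w, e * d1, n)            # step 6
    return {"public": (e, n, w, S_w), "private": (d1, d2, d), "L": L}


keys = keygen(11, 23, e=3, d2=5, w=2)  # p = 23, q = 47, n = 1081
```

The work amounts to one modular inversion, one multiplication and one exponentiation beyond choosing the primes.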

5.2 Generation of a Usual RSA Signature 

Our scheme can generate the same signature as a usual RSA signature
because it has all of the parameters used in the usual RSA scheme.

The generation and the verification of the usual signature for a message M
are as follows.

- Generation of the usual signature:

    $S = M^{d} \pmod{n}$

- Verification of the usual signature:

    $M = S^{e} \pmod{n}$

5.3 Generation of an Undeniable Signature 

The undeniable signature $S_m$ for a message m is computed by using the
public key e and one of the private keys, $d_1$, as follows.

    $S_m = m^{e d_1} \pmod{n}$    (4)
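
The difference between the two kinds of signature produced from one key set can be seen in a short sketch. It assumes the toy key from Section 5.1; the function names are hypothetical.

```python
n, L = 23 * 47, 2 * 11 * 23           # toy modulus and L = 2 p' q'
e, d2 = 3, 5
d1 = pow(e * d2, -1, L)
d = d1 * d2 % L


def rsa_sign(M):
    """Usual RSA signature S = M^d mod n (Section 5.2)."""
    return pow(M, d, n)


def rsa_verify(M, S):
    """Public check M = S^e mod n."""
    return pow(S, e, n) == M


def undeniable_sign(m):
    """Undeniable signature S_m = m^(e*d1) mod n, equation (4)."""
    return pow(m, e * d1, n)


m = 123
S = rsa_sign(m)
S_m = undeniable_sign(m)
```

Here `rsa_verify(m, S)` succeeds for anyone holding e, while `S_m` does not pass that public check; only the holder of $d_2$ can map `S_m` back to m.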
5.4 Signature Confirmation

Here we present the confirmation protocol of our undeniable signature
scheme. We denote the signer by P, the prover of his signature, and the
verifier by V, the verifier of P’s signature. The prover P must have the
parameters defined in Section 5.1, but V does not have to generate all of
them. V only has to calculate $n_v$, which is the product of two large
primes $p_v, q_v$, and open the value of $n_v$ to the public. This means
that V’s parameters may be not only those of RSA but also those of other
RSA-type encryption schemes (such as Rabin, Williams, and so on). Of
course, V’s parameters may also be those defined in Section 5.1. We define
the parameters of P and V as follows.





         Public Information              Private Information
    P    $e_p, n_p, w_p, S_{w_p}$        $p_p, q_p, d_p, d_{1p}, d_{2p}$
    V    $(e_v), n_v, (w_v, S_{w_v})$    $p_v, q_v, (d_v, d_{1v}, d_{2v})$

The parameters in parentheses are not necessary to execute our scheme.

V has already obtained $(m, S_m)$, where P has asserted that $S_m$ is a
signature for the message m. From equation (4), P’s valid signature $S_m$
for the message m is

    $S_m = m^{e_p d_{1p}} \pmod{n_p}.$

Figure 3 shows our signature confirmation protocol.

This protocol uses the following relation:

    $(M^{e_p d_{1p}})^{d_{2p}} = M^{e_p d_{1p} d_{2p}} = M^{e_p d_p} = M \pmod{n_p}.$

Now we consider some situations that can arise in this protocol.

If $S_m$ is valid, i.e., $S_m = m^{e_p d_{1p}} \pmod{n_p}$, then the value
a that P calculates in phase 2 of Figure 3 becomes

    $a = Q^{d_{2p}} \pmod{n_p}$
      $= S_m^{i d_{2p}} S_{w_p}^{j d_{2p}} \pmod{n_p}$
      $= m^{i e_p d_{1p} d_{2p}} w_p^{j e_p d_{1p} d_{2p}} \pmod{n_p}$
      $= m^{i} w_p^{j} \pmod{n_p}.$

In phase 3, P defines A as the odd one of $\pm a \pmod{n_p}$ and sends V
the value C, which is a random number x raised to the A-th power modulo
$n_v$. V reveals i and j only after receiving C. When x is one of
{1, -1, 0}, the first equality in phase 5 holds for every odd A. So x must
satisfy $2 \le x \le n_v - 2$, and in phase 5 V must check that the x sent
by P lies in this range. In phase 4, P checks the value Q against i and j.
If Q is correct, he sends A and x. In phase 5, V verifies the signature by
checking C and $A^2$. The reason for checking $A^2$ instead of A is that
the relation must hold whichever of the values $\pm a$ P has selected.
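
The five phases can be traced in one process as a sanity check of the algebra. The sketch below assumes toy moduli for both parties and the reconstructed exponents $S_{w_p} = w_p^{e_p d_{1p}}$ and $a = Q^{d_{2p}}$; the names `prover_response` and `confirm` are hypothetical. Because both roles run in the same function, the commitment check is trivially satisfied here; it only matters when P and V are separate.

```python
# Prover P: toy key in the style of Section 5.1.
n_p, L_p = 23 * 47, 2 * 11 * 23
e_p, d2_p = 3, 5
d1_p = pow(e_p * d2_p, -1, L_p)
w_p = 2
S_w = pow(w_p, e_p * d1_p, n_p)
# Verifier V: only the public modulus n_v is needed on P's side.
n_v = 59 * 83


def prover_response(Q):
    """Phase 2: a = Q^d2 mod n_p, then A is the odd one of +a and -a."""
    a = pow(Q, d2_p, n_p)
    return a if a % 2 else n_p - a


def confirm(m, S, i, j, x):
    """Walk through phases 1-5 of Figure 3; True means V accepts."""
    Q = pow(S, i, n_p) * pow(S_w, j, n_p) % n_p      # phase 1 (V)
    A = prover_response(Q)                           # phase 2 (P)
    C = pow(x, A, n_v)                               # phase 3 (P) commits to A
    # V reveals i, j; phase 4 (P) rechecks Q before opening the commitment.
    if Q != pow(S, i, n_p) * pow(S_w, j, n_p) % n_p:
        return False
    # Phase 5 (V): range of x, the commitment, and the squared relation.
    expected = pow(m, i, n_p) * pow(w_p, j, n_p) % n_p
    return (2 <= x <= n_v - 2
            and C == pow(x, A, n_v)
            and pow(A, 2, n_p) == pow(expected, 2, n_p))


m = 123
S_m = pow(m, e_p * d1_p, n_p)
```

Committing to A before i and j are revealed is what prevents P from tailoring A to the challenge.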

    P : Prover, V : Verifier

    V:       1. Choose i, j at random where 1 < i, j < n and calculate
                  $Q = S_m^{i} S_{w_p}^{j} \pmod{n_p}$.
    V -> P:  Q
    P:       2. Calculate A from Q:
                  $a = Q^{d_{2p}} \pmod{n_p}$
                  A = a (if a is odd), $A = n_p - a$ (if a is even).
             3. Choose a value x at random where $2 \le x \le n_v - 2$
                and calculate C as follows:
                  $C = x^{A} \pmod{n_v}$.
    P -> V:  C
    V -> P:  i, j
    P:       4. If $Q \neq S_m^{i} S_{w_p}^{j} \pmod{n_p}$, then Q, i, j
                are incorrect and stop.
    P -> V:  A, x
    V:       5. Check $C = x^{A} \pmod{n_v}$ and
                $A^2 = (m^{i} w_p^{j})^2 \pmod{n_p}$.
                If both equalities hold, then V accepts $S_m$ as P’s
                signature for m.

Fig. 3. Signature Confirmation Protocol

Next we consider the case in which a dishonest P wants to deceive the
verifier V, i.e., P wants to deceive V into recognizing an invalid
signature $\hat{S}_m \neq m^{e_p d_{1p}} \pmod{n_p}$ as a valid signature
for a message m. First, P signs another message m' and generates its
signature $\hat{S}_m = m'^{\,e_p d_{1p}} \pmod{n_p}$, where $m \neq m'$.
Then he asserts to
V that $\hat{S}_m$ is the signature for m. V executes the confirmation
protocol to check P’s assertion. V chooses i and j and sends
$Q = \hat{S}_m^{i} S_{w_p}^{j} \pmod{n_p}$. P can calculate
$a = Q^{d_{2p}} \pmod{n_p}$ as follows.

    $a = (\hat{S}_m^{i} S_{w_p}^{j})^{d_{2p}}$
      $= (m'^{\,i e_p d_{1p}} w_p^{j e_p d_{1p}})^{d_{2p}}$
      $= m'^{\,i} w_p^{j} \pmod{n_p}$

On the other hand, V must receive the value

    $A = \pm m^{i} w_p^{j} \pmod{n_p}$

for the check of this protocol to pass. However, P cannot calculate such
an A because he cannot determine i and j from Q, m and m'. So no one can
deceive a verifier into recognizing an invalid signature $\hat{S}_m$ as a
valid signature with this protocol.

In addition, P cannot select appropriate values C, A and x in advance to
pass the check in phase 5, because the calculation of C requires both A
and x and the calculation of A requires both i and j. However, P must send
C before he gets i and j. So he cannot calculate such values.

Next we consider the situation in which V is dishonest. In this case, the
dishonest V cannot submit an incorrectly formed value Q, because if Q is
not the true value, i.e., $Q \neq S_m^{i} S_{w_p}^{j} \pmod{n_p}$, then P
detects this in the check of phase 4.

Finally, we consider the situation in which a dishonest V wants to obtain
the prover P’s signature for another message $\hat{m}$ by using this
protocol. The relation between Q and A is

    $A^2 = Q^{2 d_{2p}} \pmod{n_p}$
    $A^{2 e_p d_{1p}} = Q^{2 e_p d_{1p} d_{2p}} = Q^2 \pmod{n_p},$

i.e., Q is the signature for one of $\pm A$. If one of $\pm A$ equals the
message $\hat{m}$, then V can present Q as P’s signature $S_{\hat{m}}$.
However, for the following reason, it is difficult for V to calculate such
a Q. From $A = \hat{m}$ we have

    $m^{i} w_p^{j} = \hat{m} \pmod{n_p},$

so if V can find i and j satisfying this condition, he can calculate Q as
$Q = S_m^{i} S_{w_p}^{j} \pmod{n_p}$. However, finding these values is as
difficult as the discrete logarithm problem. So no one can fabricate a
prover’s signature by using this protocol.

In general, the commitment scheme used in this protocol is known to be
open to an attack that uses a product of many numbers. The probability
that this attack succeeds is determined by #(A), i.e., the number of
different values of A. We have

    $\#(A) = (n_p - 1)/2$

because A is an odd number with $1 \le A < n_p$. So in this protocol the
probability is very small, i.e., this attack is not a serious threat to
our proposed scheme.



5.5 Denial of Signature 

Here we present the denial protocol of our scheme. This protocol is
different from the signature confirmation protocol shown above and is
executed when a signer wants to prove that a signature shown by the
verifier is false. We define a false signature $\hat{S}_m$ of a message m
by $\hat{S}_m \neq m^{e_p d_{1p}} \pmod{n_p}$. This protocol shows that
$\hat{S}_m$ is not the signature for m. So if a signer cannot execute this
protocol, he cannot assert that $\hat{S}_m$ is an invalid signature of the
message m. Figure 4 shows our denial protocol.
    Signature confirmation protocol
    using the parameters $\{i_1, j_1, Q_1, a_1, A_1, x_1, C_1\}$
    as $\{i, j, Q, a, A, x, C\}$ in Section 5.4

    Verifier V checks that the signature is invalid on these parameters as
    follows.

    1-1. Check: if $C_1 \neq x_1^{A_1} \pmod{n_v}$,
         then $A_1, x_1$ are incorrect and stop.
    1-2. Check: if $A_1^2 = (m^{i_1} w_p^{j_1})^2 \pmod{n_p}$,
         then the signature is valid and stop.

    Repeat of the signature confirmation protocol
    using the parameters $\{i_2, j_2, Q_2, a_2, A_2, x_2, C_2\}$
    as $\{i, j, Q, a, A, x, C\}$ in Section 5.4

    V checks that the signature is invalid on these parameters as follows.

    2-1. Check: if $C_2 \neq x_2^{A_2} \pmod{n_v}$,
         then $A_2, x_2$ are incorrect and stop.
    2-2. Check: if $A_2^2 = (m^{i_2} w_p^{j_2})^2 \pmod{n_p}$,
         then the signature is valid and stop.

    Final Check

    V checks that these confirmation protocols were executed correctly as
    follows.

    3. Check: if $(A_1 w_p^{-j_1})^{2 i_2} = (A_2 w_p^{-j_2})^{2 i_1} \pmod{n_p}$,
       then the signature is invalid.

Fig. 4. Signature Denial Protocol



This protocol is based on Chaum’s undeniable signature [10]. We modify it
to incorporate the protocol of Jakobsson, Sako and Impagliazzo [7] so that
it resists hidden verifier attacks.

This protocol consists of three parts. In the first and second parts, the
prover P and the verifier V execute the signature confirmation protocol
twice with different random values. Then in the last part, V checks that P
executed them correctly.

In this protocol, the signature confirmation protocol is used to check
whether the signature is invalid; the protocol stops if the signature is
valid.

If this protocol is executed correctly, the equality in phase 3 of the
final check holds because

    $(A_1 w_p^{-j_1})^{2 i_2} = (\pm \hat{S}_m^{i_1 d_{2p}} w_p^{j_1} w_p^{-j_1})^{2 i_2} \pmod{n_p}$
                             $= \hat{S}_m^{2 i_1 i_2 d_{2p}} \pmod{n_p},$
    $(A_2 w_p^{-j_2})^{2 i_1} = (\pm \hat{S}_m^{i_2 d_{2p}} w_p^{j_2} w_p^{-j_2})^{2 i_1} \pmod{n_p}$
                             $= \hat{S}_m^{2 i_1 i_2 d_{2p}} \pmod{n_p}.$

On the other hand, when a dishonest P wants to deny a valid signature
$S_m$, he has the following two options.

1. P ignores the values $Q_1, Q_2$

In this case, P selects $(A_1, C_1, x_1)$ and $(A_2, C_2, x_2)$ which
satisfy $C_1 = x_1^{A_1}$, $C_2 = x_2^{A_2}$. Then the protocol can
proceed through checks 1-1 to 2-2, and the equality in phase 3 of the
final check holds when the equation

    $A_2^{2 i_1} = (A_1 w_p^{-j_1})^{2 i_2} w_p^{2 i_1 j_2} \pmod{n_p}$

is satisfied. However, P cannot determine such an $A_2$ because he does
not know $i_2$ when $C_2$ is calculated in the second run.

2. P uses the values $Q_1, Q_2$

In this case, P uses $\tilde{A}_1, \tilde{A}_2$ instead of the correct
$A_1, A_2$ as follows.

    $\tilde{A}_1 = Z_1 Q_1^{d_{2p}} \pmod{n_p}$
               $= Z_1 m^{i_1} w_p^{j_1} \pmod{n_p}$
    $\tilde{A}_2 = Z_2 Q_2^{d_{2p}} \pmod{n_p}$
               $= Z_2 m^{i_2} w_p^{j_2} \pmod{n_p}$
    $(Z_1, Z_2 \neq 0, 1)$

Then the protocol can also proceed through checks 1-1 to 2-2, and the
equality in phase 3 of the final check holds when the following equation
is satisfied.

    $(\tilde{A}_1 w_p^{-j_1})^{2 i_2} = (\tilde{A}_2 w_p^{-j_2})^{2 i_1} \pmod{n_p}$
    $(Z_1 m^{i_1})^{2 i_2} = (Z_2 m^{i_2})^{2 i_1} \pmod{n_p}$
    $Z_1^{2 i_2} = Z_2^{2 i_1} \pmod{n_p}$

The order of any element modulo $n_p$ must be one of
$\{1, 2, p_p', 2p_p', q_p', 2q_p', p_p' q_p', 2 p_p' q_p'\}$ [5]. If the
order of either $Z_1$ or $Z_2$ is larger than 2, then it is hard to find
$Z_1$ and $Z_2$ satisfying the above relations. So the order of both $Z_1$
and $Z_2$ must be either 1 or 2. In this case, the protocol must stop at
checks 1-2 and 2-2, because they check $A_1^2, A_2^2$ instead of
$A_1, A_2$, and $\tilde{A}_1^2 = A_1^2$ and $\tilde{A}_2^2 = A_2^2$ always
hold in this case.

These arguments mean that a dishonest P cannot deny his valid signatures.
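
The structure of Figure 4 (two confirmation runs followed by a consistency check) can be sketched as follows. The example assumes toy parameters and an honest prover, and it leaves out the commitments C and x from checks 1-1 and 2-1 to stay short; the function names and the returned labels are hypothetical.

```python
n_p, L_p = 23 * 47, 2 * 11 * 23       # toy prover modulus and L = 2 p' q'
e_p, d2_p = 3, 5
d1_p = pow(e_p * d2_p, -1, L_p)
w_p = 2
S_w = pow(w_p, e_p * d1_p, n_p)


def prover_response(Q):
    """Honest prover: a = Q^d2 mod n_p, A is the odd one of +a and -a."""
    a = pow(Q, d2_p, n_p)
    return a if a % 2 else n_p - a


def deny(m, S_hat, rounds):
    """rounds = [(i1, j1), (i2, j2)]; returns 'valid', 'invalid' or 'inconsistent'."""
    answers = []
    for i, j in rounds:
        Q = pow(S_hat, i, n_p) * pow(S_w, j, n_p) % n_p
        A = prover_response(Q)
        expected = pow(m, i, n_p) * pow(w_p, j, n_p) % n_p
        if pow(A, 2, n_p) == pow(expected, 2, n_p):   # checks 1-2 and 2-2
            return "valid"
        answers.append(A)
    (i1, j1), (i2, j2) = rounds
    A1, A2 = answers
    # Final check 3: both sides reduce to S_hat^(2*i1*i2*d2) for an honest P.
    lhs = pow(A1 * pow(w_p, -j1, n_p) % n_p, 2 * i2, n_p)
    rhs = pow(A2 * pow(w_p, -j2, n_p) % n_p, 2 * i1, n_p)
    return "invalid" if lhs == rhs else "inconsistent"


m = 123
S_m = pow(m, e_p * d1_p, n_p)
```

Unlike the search-based protocol of Section 3.4, the outcome here is decided by a fixed number of exponentiations.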

5.6 Resistance to Hidden Verifier Attack 

We show that in our scheme the verifier can resist being forced by hidden
verifiers to confirm an undeniable signature. We denote the signer of the
undeniable signature by P, a verifier whom P allows to verify it by V, and
a hidden verifier whom P does not allow to verify it by H. H forces V to
execute the signature confirmation protocol using values chosen by H
instead of V, and checks C, A and x indirectly. Figure 5 shows this
situation in detail.
    P (Prover)  <-- signature confirmation -->  V (Allowed verifier)  <== coercion into verifying ==  H (Hidden verifier)

    P <- V : Q          V <- H : Q
    P -> V : C          V -> H : C
    P -> V : A, x       V -> H : A, x

Fig. 5. Signature Confirmation by Hidden Verifier



    V : Allowed Verifier, H : Hidden Verifier

    H:       1. Choose two values i', j' at random where 1 < i', j' < n
                and calculate Q' as follows:
                  $Q' = S_m^{i'} S_{w_p}^{j'} \pmod{n_p}$.
    H -> V:  Q'
    V:       2. Choose a value C' at random where $1 < C' < n_v$.
    V -> H:  C'
    H -> V:  i', j'
    V:       3. Determine A' and x' as follows:
                  $a' = m^{i'} w_p^{j'} \pmod{n_p}$
                  A' = a' (if a' is odd), $A' = n_p - a'$ (if a' is even)
                  $x' = C'^{\,A'^{-1} \bmod L_v} \pmod{n_v}$
                where $L_v = \mathrm{LCM}(p_v - 1, q_v - 1)$.
    V -> H:  A', x'
    H:       4. Check $C' = x'^{A'} \pmod{n_v}$ and
                $A'^2 = (m^{i'} w_p^{j'})^2 \pmod{n_p}$.
                (In this protocol, both equalities always hold.)

Fig. 6. Resistance Protocol to Hidden Verifier Attack



However, V can execute the resistance protocol in Figure 6.

In this protocol, the two equations that H checks always hold, regardless
of whether the signature is valid or invalid. V determines x' so that

    $x'^{A'} = C' \pmod{n_v}$, i.e.,
    $x' = C'^{\,A'^{-1}} \pmod{n_v}$.

This is equivalent to computing the private key corresponding to the
public key A'. Because A' is an odd number, $A'^{-1} \bmod L_v$ always
exists except in the case where A' is a multiple of either $p_v'$ or
$q_v'$, where $p_v = 2p_v' + 1$ and $q_v = 2q_v' + 1$.

Because this protocol exists, H cannot trust that V has honestly executed
the signature confirmation protocol as forced by H, and so H cannot
confirm P’s signature.

This means that our scheme can resist hidden verifier attacks.
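
The point of Figure 6 is that V can produce an accepting transcript without ever contacting P. A minimal sketch, assuming toy moduli and hypothetical function names, shows V opening an arbitrary commitment C' using only public data of P and its own secret $L_v$:

```python
from math import gcd

# Public values of the signer P (toy sizes).
n_p, w_p = 23 * 47, 2
# V's own key material; the factorization of n_v is known only to V.
p_v, q_v = 59, 83
n_v = p_v * q_v
L_v = (p_v - 1) * (q_v - 1) // gcd(p_v - 1, q_v - 1)


def simulate(m, i, j, C):
    """Step 3 of Figure 6: V fabricates (A', x') without P's help."""
    a = pow(m, i, n_p) * pow(w_p, j, n_p) % n_p
    A = a if a % 2 else n_p - a
    if gcd(A, L_v) != 1:
        raise ValueError("A' has no inverse mod L_v (the exceptional case)")
    x = pow(C, pow(A, -1, L_v), n_v)   # x'^A' = C' (mod n_v)
    return A, x


def hidden_verifier_check(m, i, j, C, A, x):
    """Step 4 of Figure 6: the two equalities H can test."""
    expected = pow(m, i, n_p) * pow(w_p, j, n_p) % n_p
    return C == pow(x, A, n_v) and pow(A, 2, n_p) == pow(expected, 2, n_p)


A_sim, x_sim = simulate(123, 2, 2, 1234)
```

Note that `simulate` never uses the signature itself, so the transcript tells H nothing about whether any particular signature is valid.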



5.7 Conversion of the Undeniable Signature 

Here we present the way to convert our undeniable signature proposed in
Section 5.3 into the usual RSA signature presented in Section 5.2.

Suppose there is an undeniable signature $S_m$ corresponding to a message
m. If this signature is valid, then we have

    $S_m = m^{e_p d_{1p}} \pmod{n_p}.$

When a signer wants to convert only the signature $S_m$ into a usual RSA
signature, he publishes the following converting information $C_m$:

    $C_m = m^{d_p - e_p d_{1p}} \pmod{n_p}.$    (5)

Anyone who gets the message m, the undeniable signature $S_m$
corresponding to m and the converting information $C_m$ corresponding to
$S_m$ can verify them by

    $(C_m S_m)^{e_p} = m \pmod{n_p}.$

When a signer wants to convert all of the signatures he signed previously,
he can choose between two ways.

1. Open $d_{2p}$ to the public

If $d_{2p}$ is opened to the public, then anyone can verify all of the
undeniable signatures $S_m$ by checking

    $S_m^{d_{2p}} = m \pmod{n_p}.$

2. Open C* to the public

We define the converting value C* as

    $C^* = d_p - e_p d_{1p} \pmod{L_p}$

where $L_p = \mathrm{LCM}(p_p - 1, q_p - 1)$. If the converting value C*
is opened to the public, then anyone can verify all of the undeniable
signatures $S_m$ by checking

    $(m^{C^*} S_m)^{e_p} = m \pmod{n_p}.$
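
The three conversion routes can be compared side by side in a short sketch. It assumes the toy key used earlier and the reconstructed form of equation (5); the variable names are hypothetical.

```python
n, L = 23 * 47, 2 * 11 * 23           # toy modulus and L = 2 p' q'
e, d2 = 3, 5
d1 = pow(e * d2, -1, L)
d = d1 * d2 % L

m = 123
S_m = pow(m, e * d1, n)               # undeniable signature, equation (4)

# Selective conversion: publish C_m for this one signature only.
C_m = pow(m, (d - e * d1) % L, n)
selective_ok = pow(C_m * S_m % n, e, n) == m

# Total conversion, way 1: publish d2.
via_d2_ok = pow(S_m, d2, n) == m

# Total conversion, way 2: publish C* = d - e*d1 mod L.
C_star = (d - e * d1) % L
via_cstar_ok = pow(pow(m, C_star, n) * S_m % n, e, n) == m
```

In the selective case only $C_m$ is revealed, so other undeniable signatures made with the same key stay unverifiable without the signer.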
6 Conclusion

In this paper, we pointed out some problems in Gennaro-Krawczyk-Rabin’s
undeniable signature scheme based on RSA: restrictions on the
convertibility of signatures, the existence of hidden verifiers, and the
probabilistic denial protocol.

We proposed a new undeniable signature scheme that modifies the original
one to solve these problems.

The key generation of our scheme makes each signature individually
convertible to a usual RSA signature. Both the signature confirmation
protocol and the denial protocol were improved to resist the hidden
verifier attack by using the protocol of Jakobsson, Sako and Impagliazzo.

The denial protocol was also made deterministic by using Chaum’s decisive
denial protocol.

We also showed how to convert an undeniable signature into a usual RSA
signature individually.
Abstract. Group signature schemes allow a group member to sign messages
anonymously on behalf of the group. In case of dispute, only a designated
group manager can reveal the identity of the member. During the last
decade, group signature schemes have been intensively investigated in
the literature and applied to various applications. However, no scheme
has properly handled the situation in which a group member wants to
leave a group or is excluded by a group manager. As noted in [2], the
complexity of member deletion stands in the way of real-world
applications of group signatures, and the member deletion problem has
been a pressing open problem. In this paper, we propose an efficient
group signature scheme that allows member deletion. The length of the
group public key and the size of signatures are independent of the size
of the group, and the security of the scheme relies on the RSA
assumption. In addition, a method of tracing all signatures of a
specific member is introduced.



1 Introduction 

The concept of a group signature was introduced by D. Chaum and E. van
Heyst [7]. It allows members of a group to sign messages anonymously on
behalf of the group. The signed messages are then verified with a group
public key. In case of dispute, only a designated group manager can reveal
the identity of the member. Various group signature schemes have been
investigated with the aim of developing an efficient scheme in which the
length of signatures and the size of the group public key are independent
of the size of the group. The group signature schemes in [3, 7, 8, 10] do
not satisfy the length requirement and/or the size requirement. Only the
schemes proposed in [1, 4-6] satisfy both requirements.

Group signature schemes should be coalition resistant. In other words, no
subset of group members, even one including the group manager, should be
able to generate valid group signatures that are untraceable or whose
opening reveals the identity of another group member. The schemes in
[1, 4] are provably coalition-resistant group signature schemes.

D. Won (Ed.): ICISC 2000, LNCS 2015, pp. 150-161, 2001. 

Springer-Verlag Berlin Heidelberg 2001 
With the improvement in both the efficiency and the security of group
signature schemes, the concept of a group signature is being brought to
various applications such as electronic cash systems, bidding, and voting.
However, for group signature schemes to be adopted in real applications, a
few problems need to be solved. One of the most important among them is
the efficiency of member deletion.

In practical applications, a group is dynamic, i.e., membership changes
frequently. A group member may voluntarily leave the group for various
reasons, such as a promotion. If a group member behaves dishonestly, the
group manager must exclude him from the group. In this case, it may be
necessary to trace all of the signatures generated by him while preserving
the anonymity of the others. As stated in [2], no proposed group signature
scheme adequately addresses the member deletion problem, and efficient and
secure member deletion has remained a pressing and interesting open
problem.

In this paper, we propose a new group signature scheme which allows member
deletion and tracing of the signatures generated by a specific member. Our
scheme is based on Camenisch and Michels’ group signature scheme [4], to
which we add a member deletion procedure. Whenever a member joins or
leaves the group, the public information and each member’s secret are
modified without re-issuing membership certificates. Each modification
requires only one modular multiplication. Hence our model is an acceptable
solution for a large group where membership changes frequently. The group
public keys, a member’s secret key, and the signature are all of constant
size. The total computational complexity of registration and deletion of a
member is linear in the group size, but the computational burden is
decentralized, i.e., each member updates his own key. The computational
complexity of the other procedures is independent of the group size.

This paper is organized as follows. Section 2 presents the model that per- 
mits deletion/addition of a member in group signature schemes and the security 
requirements. Section 3 describes the assumptions on which the security of the 
proposed group signature schemes is based. We develop our scheme in Section 
4 and analyze the security of our scheme in Section 5. Finally, we conclude in 
Section 6. 

2 The Model and our Approach 

This section describes the model that allows member deletion in a group signa- 
ture scheme, the security requirements, and the approach of our proposal. The 
main difference between our model and other models previously proposed is that 
our model has a member deletion procedure and sign-tracing capability. 

2.1 The Model 

A group signature scheme consists of the following procedures: 

Setup : An interactive protocol between the membership manager and the
revocation manager. The outputs are the membership manager’s secret key
$x_M$ and public key $y_M$, and the revocation manager’s secret key $x_R$
and public key $y_R$.

Join : An interactive protocol between the membership manager and a user
that results in the user becoming a new group member. The outputs are the
group member’s secret key $x_G$, the group member’s public key $y_G$, the
group member’s secret property key $U_G$, the group’s public property
key $U_M$ and the group’s public renewal property key $U_N$.

Delete : A member deletion algorithm that on input a member’s public key
$y_G$ outputs the group’s public property key $U_M$ and the renewal
property key $U_N$.

Sign : A signature generation algorithm that on input a message m, $x_G$,
$y_G$, $y_M$, $y_R$, and $U_G$ outputs a signature $\sigma$.

Verify : A verification algorithm that on input a message m, a signature
$\sigma$, $y_M$, $y_R$, and $U_M$ returns 1 if and only if $\sigma$ was
generated by a proper group member using Sign on input m, $x_G$, $y_G$,
$y_M$, $y_R$ and $U_G$.

User-Tracing : A user tracing algorithm that on input a signature
$\sigma$, a message m, $x_R$, and $y_R$ outputs the identity of the group
member who generated the signature $\sigma$.

Sign-Tracing : A sign tracing algorithm that on input a part of a
signature $\sigma$, $y_G$, and $x_R$ outputs 1 if and only if the
signature was generated by the specific member.



The following are the security requirements:

Unforgeability of signatures : Only current group members are able to
generate valid signatures. Furthermore, a signature can be user-traced and
sign-traced by the revocation manager when needed. In particular, once a
group member leaves the group, he can no longer generate valid signatures.

Anonymity : It is infeasible for anyone except the revocation manager to
find out which member generated a given signature.

Unlinkability of signatures : Given two signatures, no one except the
revocation manager can decide whether the signatures were computed by the
same group member.

No framing : No coalition of group members, the membership manager, and
the revocation manager can compute a signature on behalf of a non-involved
group member. Furthermore, they cannot sign a message on behalf of a
deleted group member.

Unforgeability of user-tracing verification : Given a signature, the
revocation manager cannot falsely blame a signer for having produced the
signature.

Unforgeability of sign-tracing verification : The revocation manager
cannot falsely claim that a signature was generated by a designated
member.



2.2 The Approach of Our Proposal 

The core idea for handling membership changes is as follows. The membership
manager maintains a group property key $U_M$ and a group renewal property
key $U_N$. When a new member joins the group, the membership manager issues
a secret property key $U_G$ to the new member.

For each registration or deletion, the membership manager updates the group
public property key $U_M$ and the group public renewal property key $U_N$, and
publishes $U_M$ and $U_N$. Each remaining member renews his secret property key
$U_G$ using the updated group renewal property key $U_N$ and checks the validity
of his new key using the updated group property key $U_M$. A group member
uses his secret property key $U_G$ to sign a message, and anyone who wants to
verify a signature must use $U_M$. It is computationally infeasible for a deleted
member to generate a valid signature using the group renewal property key and
his old secret property key.

As explained above, the membership manager is only involved in issuing 
a new member’s secret key and the secret key of each current member is re- 
generated by himself. Hence computations needed in membership changes are 
distributed to each member. 

3 Preliminaries 

In this section we describe the cryptographic assumptions necessary for the
subsequent construction of the proposed group signature scheme. Our scheme is
based on the group signature scheme by Camenisch and Michels [4] and adds
member deletion capability. We briefly describe their scheme in this section,
but do not explain the building blocks for the group signature scheme described
in Section 4 of [4]. Those building blocks are proof systems in which one party
can convince other parties that he knows certain values without leaking useful
information. Since they are used both for signing a message and for proving
knowledge of a secret, they are called “signatures based on a proof of
knowledge”, SPK for short. Whenever an SPK is used in the rest of the paper,
what it proves is explained without details. For more details, refer to [4] and [9].

Let $l_g$ be a security parameter and let $G$ be a group whose order has length
$l_g$ and factors into two primes of length $(l_g - 2)/2$.

Problem 1 (RSA Problem). Given $G$ and $(z, e) \in G \setminus \{\pm 1\} \times \mathbb{Z}$,
find $u \in G$ such that $u^e = z$.

Let $T$ denote a key-generator that on input $1^{l_g}$ outputs a group $G$ and
an element $z \in G \setminus \{\pm 1\}$.

Assumption 1 (RSA Assumption). There exists a probabilistic algorithm $T$
such that for all probabilistic polynomial-time algorithms $A$, all polynomials
$p(\cdot)$, and all sufficiently large $l_g$,

$$\Pr[\, z = u^e \mid (G, z, e) := T(1^{l_g}),\ u := A(G, z, e) \,] < \frac{1}{p(l_g)}.$$

The following two assumptions are due to [4]. They proposed the Modified
Strong RSA assumption, which is a modification of the strong RSA assumption
by Fujisaki and Okamoto [9]. Let $k$, $l_1$, $l_2 < l_g$ and $\epsilon > 1$ be
security parameters. Denote $\tilde{l} := \epsilon(l_2 + k) + 1$. Let
$\mathcal{M}(G, z) = \{ (u, e) \mid z = u^e,\ u \in G,\ e \in \{2^{l_1} - 2^{l_2}, \ldots, 2^{l_1} + 2^{l_2}\},\ e \text{ prime} \}$
where $z \in G$.

Problem 2 (Modified Strong RSA Problem). Given $G$, $z \in G$, and
$M \subset \mathcal{M}(G, z)$ with $|M| = O(l_g)$, find a pair $(u, e) \in G \times \mathbb{Z}$
such that $u^e = z$, $e \in \{2^{l_1} - 2^{\tilde{l}}, \ldots, 2^{l_1} + 2^{\tilde{l}}\}$,
and $(u, e) \notin M$.

Assumption 2 (Modified Strong RSA Assumption). There exists a
probabilistic algorithm $T$ such that for all probabilistic polynomial-time
algorithms $A$, all polynomials $p(\cdot)$, all sufficiently large $l_g$, all
$M \subset \mathcal{M}(G, z)$ with $|M| = O(l_g)$ and suitably chosen $l_1$, $l_2$, $k$
and $\epsilon$,

$$\Pr[\, z = u^e \wedge e \in \{2^{l_1} - 2^{\tilde{l}}, \ldots, 2^{l_1} + 2^{\tilde{l}}\} \wedge (u, e) \notin M \mid (G, z) := T(1^{l_g}),\ (u, e) := A(G, z, M) \,] < \frac{1}{p(l_g)}.$$



As noted in [4], if two pairs $(u, e)$, $(u', e')$ with $z = u^e = u'^{e'}$ and
$\gcd(e, e') = 1$ are known, it is easy to find an element $\hat{u}$ satisfying
$z = \hat{u}^{ee'}$ using the extended Euclidean algorithm. But $ee'$ does not
satisfy the range constraint, since
$ee' \notin \{2^{l_1} - 2^{\tilde{l}}, \ldots, 2^{l_1} + 2^{\tilde{l}}\}$ for suitably
chosen parameters $l_g$, $l_1$, $l_2$, $\epsilon$, and $k$. Therefore, group signature
schemes based on the modified strong RSA assumption are coalition resistant.
By [4] and [11], the Modified Strong RSA Problem is at least as hard as the
Strong RSA Problem.
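
The combination step mentioned above can be illustrated with a short Python sketch. It is only an illustration: the modulus, base and exponents are toy values assumed for this example, and the helper names `ext_gcd` and `combine_roots` are hypothetical, not taken from [4]. With Bézout coefficients $a, b$ satisfying $ae + be' = 1$, the element $\hat{u} = u'^{a} u^{b}$ satisfies $\hat{u}^{ee'} = z$.

```python
def ext_gcd(x, y):
    """Return (g, a, b) with a*x + b*y == g == gcd(x, y)."""
    if y == 0:
        return x, 1, 0
    g, a, b = ext_gcd(y, x % y)
    return g, b, a - (x // y) * b


def combine_roots(u, e, u2, e2, n):
    """Given u**e == u2**e2 == z (mod n) and gcd(e, e2) == 1,
    return u_hat with u_hat**(e*e2) == z (mod n)."""
    g, a, b = ext_gcd(e, e2)          # a*e + b*e2 == 1
    if g != 1:
        raise ValueError("exponents must be coprime")
    # pow() with a negative exponent uses the modular inverse (Python 3.8+).
    return (pow(u2, a, n) * pow(u, b, n)) % n


# Toy instance (assumed numbers): n = 11 * 23, two different roots of the same z.
n, w = 253, 2
e, e2 = 3, 5
z = pow(w, e * e2, n)
u, u2 = pow(w, e2, n), pow(w, e, n)   # u**e == u2**e2 == z (mod n)
u_hat = combine_roots(u, e, u2, e2, n)
```

The combined exponent $ee'$ is roughly the product of two values near $2^{l_1}$, which is why it falls outside the admissible range for $e$.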

Besides the Modified Strong RSA Assumption, Camenisch and Michels’ group
signature scheme relies on the Diffie-Hellman Decision (DHD) assumption. To
state this assumption, we define the two sets

$$DH(G) := \{ (g_1, y_1, g_2, y_2) \in G^4 \mid \mathrm{ord}(g_1) = \mathrm{ord}(g_2) = n',\ \log_{g_1} y_1 = \log_{g_2} y_2 \},$$
$$Q(G) := \{ (g_1, y_1, g_2, y_2) \in G^4 \mid \mathrm{ord}(g_1) = \mathrm{ord}(g_2) = \mathrm{ord}(y_1) = \mathrm{ord}(y_2) = n' \}$$

with $n'$ dividing the order of $G$ and $|n'| = l_g - 2$.



Assumption 3 (Diffie-Hellman Decision Assumption). There exists a
probabilistic algorithm $T$ such that for all probabilistic polynomial-time
algorithms $A$, and all sufficiently large $l_g$, the two probability distributions

$$\Pr[\, a = 1 \mid G := T(1^{l_g}),\ K \in_R DH(G),\ a := A(K) \,]$$

and

$$\Pr[\, a = 1 \mid G := T(1^{l_g}),\ K \in_R Q(G),\ a := A(K) \,]$$

are computationally indistinguishable.

Consider the DHD assumption in the case $G = \mathbb{Z}_n^*$ where $n$ is an
RSA-modulus. In this case the DHD assumption does not hold in general since the
Jacobi symbol leaks information about $\log_{g_1} y_1$ and $\log_{g_2} y_2$ for some
$g_1, g_2, y_1, y_2 \in G$. For example, if $(g_1|n) = (g_2|n) = (y_2|n) = -1$ and
$(y_1|n) = 1$, then $\log_{g_1} y_1 \neq \log_{g_2} y_2$.
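
A small Python sketch may make the leak concrete. The `jacobi` helper is the textbook binary algorithm, and the modulus 77 and base 2 are toy values assumed purely for illustration. Because the Jacobi symbol is multiplicative, $(g^x|n) = (g|n)^x$, so a base with $(g|n) = -1$ reveals the parity of the exponent $x$.

```python
def jacobi(a, n):
    """Jacobi symbol (a|n) for a positive odd integer n."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:             # pull out factors of two
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                   # quadratic reciprocity
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


n = 7 * 11                            # toy RSA-style modulus (assumption)
g = 2                                 # (2|77) = -1
# Pairs (parity of x, Jacobi symbol of g^x mod n) for x = 1..8.
parities = [(x % 2, jacobi(pow(g, x, n), n)) for x in range(1, 9)]
```

In every pair the symbol is $+1$ exactly when $x$ is even, so an observer who sees one element with symbol $+1$ and another with symbol $-1$ (both to bases with symbol $-1$) knows their logarithms differ.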



Note 1. In the case $G = \mathbb{Z}_n^*$ where $n$ is an RSA-modulus, if
$G = \langle g \rangle$ is defined to be a subgroup of $\mathbb{Z}_n^*$ with
$(g|n) = 1$, then the DHD assumption holds.

4 The Proposed Scheme 

This section describes our new group signature scheme that allows member
deletion and sign tracing. The security of the proposed scheme is based on
Assumptions 1, 2 and 3. In particular, the security of the property keys relies
on Assumption 1. The scheme is described mainly from the viewpoint of adding
and deleting a member.



4.1 System Setup 

In our scheme, the membership manager supervises the group members, and the
revocation manager, acting as a trustee, performs the tracing protocols while
guaranteeing the anonymity of legitimate members. They first set up the system
by generating the group’s public keys and choosing their secret keys.

The membership manager executes the setup procedure as follows:

1. Choose a group $G = \langle g \rangle$ and two random elements $z, h \in G$ with
the same (large) order ($\approx 2^{l_g}$) such that Assumptions 2 and 3 hold.
Computing discrete logarithms in $G$ to the bases $g$, $h$, or $z$ must be
infeasible, and only the membership manager should be able to compute roots
in $G$ easily. That is, he should keep the order of $G$ secret.

2. Choose two large random primes $p$ and $q$ ($\approx 2^{l_g/2}$) of the form
$p = 2p' + 1$, $q = 2q' + 1$ where $p'$, $q'$ are primes, such that
$p, q \not\equiv 1 \pmod 8$ and $p \not\equiv q \pmod 8$.

3. Keep $p$ and $q$ secret and publish $n = pq$.

4. Choose a public key $e_N$ and a secret key $d_N$ such that
$e_N d_N \equiv 1 \pmod{\phi(n)}$ where $n$ is an RSA-modulus.

5. Publish $z$, $g$, $h$, $G$, $e_N$ and $l_g$ and prove that $g$, $h$ and $z$ have
the same order.

The revocation manager executes the setup procedure as follows:

1. Choose his secret key $x_R$ randomly in $\{0, \ldots, 2^{l_g} - 1\}$.

2. Publish $y_R = g^{x_R}$ as his public key.

Also, the membership manager sets up a hash function
$H : \{0,1\}^* \to \{0,1\}^k$ and security parameters $\hat{l}$, $l_1$, $l_2$ and
$\epsilon$.

4.2 Join

This is an interactive protocol between the membership manager and Alice, who
wants to become a new group member. Alice chooses a prime $x_G$ randomly in an
appropriate range and keeps $x_G$ secret. The membership manager extracts an
element $y_G \in G$ such that $y_G^{x_G} = z$ holds. This pair $(x_G, y_G)$ is the
membership key of Alice. The membership manager also regenerates the group’s
public property key $U_M$ and renewal property key $U_N$ using $y_G$, and
generates Alice’s secret property key $U_G$. Each regenerated value of the group
property key and the renewal property key is published. Before generating any
signature, current members check whether the group renewal property key has
been updated.

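Alice does the following:

1. Choose random primes $\hat{x}_G$ and $x_G$ in the appropriate range such that
$\hat{x}_G, x_G \not\equiv 1 \pmod 8$ and $\hat{x}_G \not\equiv x_G \pmod 8$.

2. Compute $\tilde{x}_G := \hat{x}_G x_G$ and $\tilde{z} := z^{\hat{x}_G}$.

3. Commit to $\tilde{x}_G$ and $\tilde{z}$.

4. Send $\tilde{x}_G$, $\tilde{z}$ and their commitments to the membership manager.

5. Execute the interactive protocols corresponding to

$$W := SPK\{ (\tau, \varrho) : z^{\tilde{x}_G} = \tilde{z}^{\tau} \wedge \tilde{z} = z^{\varrho} \wedge (2^{l_1} - 2^{\epsilon(l_2+k)+1} < \tau < 2^{l_1} + 2^{\epsilon(l_2+k)+1}) \}$$

with the membership manager.

$W$ is a statistical zero-knowledge proof of knowledge of the discrete logarithm
of $\tilde{z}$ ($= z^{\hat{x}_G}$) and an integer $x_G$ such that
$x_G \in \{2^{l_1} - 2^{\epsilon(l_2+k)+1}, \ldots, 2^{l_1} + 2^{\epsilon(l_2+k)+1}\}$ and
$z^{\tilde{x}_G} = \tilde{z}^{x_G}$. Therefore, by the proof $W$, the membership
manager trusts that Alice has chosen $\tilde{x}_G$ and $\tilde{z}$ correctly.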

Let $C := \{G_1, G_2, \ldots, G_{m-1}\}$ be the set of current group members and
let $G_m$ be the new member, Alice. Let $y_{G_i}$ denote each member’s public key.
Before adding Alice to the group, the group’s public property key is
$U_M := y_{G_1} \cdots y_{G_{m-1}} y'$ with a random number $y' \in_R G$. The
membership manager does the following:

1. Generate Alice’s public key $y_{G_m} := z^{1/x_{G_m}}$.

2. Compute the new group’s public property key
$U_M := y_{G_1} \cdots y_{G_{m-1}} y_{G_m} y''$ where $y'' \in_R G$.

3. Compute the group’s renewal property key
$U_N := (y_{G_m} y''/y')^{d_N}$.

4. Generate the member $G_m$’s secret property key
$U_{G_m} := (y_{G_1} y_{G_2} \cdots y_{G_{m-1}} y'')^{d_N}$.



The membership manager publishes $U_M$ and $U_N$, and sends the member’s public
key $y_{G_m}$ and secret property key $U_{G_m}$ to Alice. The pair
$(x_{G_m}, y_{G_m})$ becomes the membership key of Alice. The new member $G_m$,
Alice, verifies her public key $y_{G_m}$ and secret property key $U_{G_m}$ by
checking $y_{G_m}^{x_{G_m}} = z$ and $(U_{G_m})^{e_N} y_{G_m} = U_M$ respectively.
Each valid group member $G_i$ ($1 \le i \le m-1$) other than the new member $G_m$
changes his secret property key
$U_{G_i} := (y_{G_1} \cdots y_{G_{i-1}} y_{G_{i+1}} \cdots y_{G_{m-1}} y')^{d_N}$
into $U_{G_i} := U_{G_i} \cdot U_N$, that is,

$$U_{G_i} = (y_{G_1} \cdots y_{G_{i-1}} y_{G_{i+1}} \cdots y_{G_{m-1}} y')^{d_N} \cdot (y_{G_m} y''/y')^{d_N}
= (y_{G_1} \cdots y_{G_{i-1}} y_{G_{i+1}} \cdots y_{G_{m-1}} y_{G_m} y'')^{d_N}.$$

Each group member can check the new value $U_{G_i}$ by verifying
$U_M = (U_{G_i})^{e_N} y_{G_i}$.
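
The key renewal on a join can be traced with a toy computation. The following Python sketch is only an illustration under simplifying assumptions: all values live in $\mathbb{Z}_n^*$ for a tiny modulus, member public keys are arbitrary units (the relation $y_G^{x_G} = z$ is not modelled), and the variable names are hypothetical.

```python
# Toy RSA parameters (assumed, far too small for real use).
p, q = 23, 47
n, phi = p * q, (p - 1) * (q - 1)
e_N = 3
d_N = pow(e_N, -1, phi)                 # e_N * d_N = 1 (mod phi(n))


def prod(values):
    """Product of the given values modulo n."""
    out = 1
    for v in values:
        out = out * v % n
    return out


# Current members' public keys and the current random factor y'.
y = {"G1": 2, "G2": 3, "G3": 5}
y_rand = 7
U_M = prod(list(y.values()) + [y_rand])
# Secret property key of G_i: (product of all other keys times y')^d_N.
U = {i: pow(prod([v for j, v in y.items() if j != i] + [y_rand]), d_N, n)
     for i in y}

# Manager side: Alice joins as G4 with public key y_new and fresh factor y''.
y_new, y_rand2 = 11, 13
U_M_new = prod(list(y.values()) + [y_new, y_rand2])
U_N = pow(y_new * y_rand2 * pow(y_rand, -1, n) % n, d_N, n)
U_alice = pow(prod(list(y.values()) + [y_rand2]), d_N, n)

# Member side: one multiplication to renew, one exponentiation to check.
U_updated = {i: U[i] * U_N % n for i in U}
checks = [pow(U_updated[i], e_N, n) * y[i] % n == U_M_new for i in y]
alice_ok = pow(U_alice, e_N, n) * y_new % n == U_M_new
```

Each existing member performs a single modular multiplication by the published $U_N$, which is the workload distribution described in Section 2.2.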

4.3 Delete 

This protocol is similar to the addition of a group member. To delete the group
member $G_j$, the membership manager eliminates the public key $y_{G_j}$ from the
group public property key $U_M$ and changes the random number. The remaining
group members change their secret property keys so that they can still generate
valid signatures.

Let the current group’s public property key be
$U_M := y_{G_1} \cdots y_{G_m} y'$ where $y' \in_R G$. The membership manager
performs the deletion protocol as follows:

1. Compute $U_M := U_M \cdot y''/(y_{G_j} y')$ where $y'' \in_R G$,
i.e., $U_M = y_{G_1} \cdots y_{G_{j-1}} y_{G_{j+1}} \cdots y_{G_m} y''$.

2. Compute $U_N := (y''/(y_{G_j} y'))^{d_N}$.

3. Publish $(U_M, U_N)$.

Each group member $G_i$ changes his secret property key $U_{G_i}$ into
$U_{G_i} := U_{G_i} \cdot U_N$, for example,

$$U_{G_i} := (y_{G_1} \cdots y_{G_{i-1}} y_{G_{i+1}} \cdots y_{G_m} y')^{d_N} \cdot (y''/(y_{G_j} y'))^{d_N}
= (y_{G_1} \cdots y_{G_{i-1}} y_{G_{i+1}} \cdots y_{G_{j-1}} y_{G_{j+1}} \cdots y_{G_m} y'')^{d_N}, \quad \text{for } i < j.$$

Each group member verifies the new value $U_G$ by checking whether
$(U_G)^{e_N} y_G = U_M$.
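
The deletion update can be sketched in the same toy setting. Again this is an illustration only: a tiny assumed modulus, arbitrary units as member public keys, and hypothetical names such as `delete_member`. It shows the manager's computation and the check carried out by the remaining members.

```python
# Toy RSA parameters (assumed, far too small for real use).
p, q = 23, 47
n, phi = p * q, (p - 1) * (q - 1)
e_N = 3
d_N = pow(e_N, -1, phi)


def prod(values):
    """Product of the given values modulo n."""
    out = 1
    for v in values:
        out = out * v % n
    return out


y = {"G1": 2, "G2": 3, "G3": 5, "G4": 11}   # members' public keys
y_rand = 13                                  # current random factor y'
U_M = prod(list(y.values()) + [y_rand])
U = {i: pow(prod([v for k, v in y.items() if k != i] + [y_rand]), d_N, n)
     for i in y}


def delete_member(j, U_M, y_rand, y_rand_new):
    """Manager side: return the new (U_M, U_N) after removing member j."""
    factor = y_rand_new * pow(y[j] * y_rand % n, -1, n) % n  # y'' / (y_Gj * y')
    return U_M * factor % n, pow(factor, d_N, n)


U_M_new, U_N = delete_member("G2", U_M, y_rand, 17)

# Remaining members renew their keys with one multiplication and re-check.
remaining = [i for i in y if i != "G2"]
U_updated = {i: U[i] * U_N % n for i in remaining}
checks = {i: pow(U_updated[i], e_N, n) * y[i] % n == U_M_new for i in remaining}
```

The manager's cost here is the inversion of $y_{G_j} y'$, one exponentiation by $d_N$ and a few multiplications; membership keys are not re-issued.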



4.4 Sign 

First, we define a group signature.



Definition 1. Let $\epsilon$, $l_1$ and $l_2$ be security parameters such that
$\epsilon > 1$, $l_2 < l_1 < l_g$, and $l_2 < \ldots - k$ holds. A group-signature of
a message $m \in \{0,1\}^*$ is a tuple

$$(c, s_1, s_2, s_3, a, b, d, \alpha, \beta) \in \{0,1\}^k \times \{-2^{l_2+k}, \ldots, 2^{\epsilon(l_2+k)}\} \times \{-2^{l_g+l_1+k}, \ldots, 2^{\epsilon(l_g+l_1+k)}\} \times \{-2^{l_g+k}, \ldots, 2^{\epsilon(l_g+k)}\} \times G^5$$



satisfying 






Remark 1. Such a group-signature would be denoted

$$\mathcal{L} = SPK\{ (\theta, \xi, \psi) : z = b^{\theta}/y_R^{\xi} \wedge 1 = a^{\theta}/g^{\xi} \wedge a = g^{\psi} \wedge d = g^{\theta} h^{\psi} \wedge \beta = y_R^{\psi} h^{e_N \psi} \wedge (2^{l_1} - 2^{\epsilon(l_2+k)+1} < \theta < 2^{l_1} + 2^{\epsilon(l_2+k)+1}) \}$$

The non-interactive protocol corresponding to $\mathcal{L}$ is a statistical
zero-knowledge proof of knowledge of the discrete logarithm of $a$ and an integer
$x_G$ such that
$x_G \in \{2^{l_1} - 2^{\epsilon(l_2+k)+1}, \ldots, 2^{l_1} + 2^{\epsilon(l_2+k)+1}\}$ and
$y_G = z^{1/x_G}$.

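To sign a message $m \in \{0,1\}^*$, a group member does the following:

1. Choose an integer $w \in_R \{0,1\}^{l_g}$ and compute $a := g^w$,
$b := y_G y_R^w$, $d := g^{x_G} h^w$, $\alpha := U_G h^w$, and
$\beta := y_R^w h^{e_N w}$.

2. Choose $r_1 \in_R \{0,1\}^{\epsilon(l_2+k)}$, $r_2 \in_R \{0,1\}^{\epsilon(l_g+l_1+k)}$
and $r_3 \in_R \{0,1\}^{\epsilon(l_g+k)}$.

3. Compute $t_1 := b^{r_1}(1/y_R)^{r_2}$, $t_2 := a^{r_1}(1/g)^{r_2}$,
$t_3 := g^{r_3}$, $t_4 := g^{r_1} h^{r_3}$ and $t_5 := y_R^{r_3} h^{e_N r_3}$.

4. Compute $c := H(g \| h \| y_R \| z \| a \| b \| d \| \beta \| t_1 \| t_2 \| t_3 \| t_4 \| t_5 \| m)$.

5. Compute $s_1 := r_1 - c(x_G - 2^{l_1})$ (in $\mathbb{Z}$), $s_2 := r_2 - c w x_G$
(in $\mathbb{Z}$), and $s_3 := r_3 - c w$ (in $\mathbb{Z}$).

The resulting signature on the message $m$ is
$(c, s_1, s_2, s_3, a, b, d, \alpha, \beta)$.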



4.5 Verifying Signatures, User- Tracing, and Sign- Tracing 

The verification procedure is an extension of the verification procedure in [4] 
that adds the check that the signature is verified by the group property key used 
at the date when the signature was generated. User-tracing procedure in [4] is 
unchanged and sign-tracing is newly added in our scheme. For a given signature, 
sign-tracing decides whether a designated user generated the signature. It may 
be viewed as the concept which is similar to coin-tracing in electronic cash sys- 
tems. Note that for both verification and sign-tracing, the history of old group 
property keys with the dates updated and the date when a signature was gen- 
erated should be available. This can be resolved by keeping the history of group 
property keys and the updated dates in a table and embedding the generated 
date in a signature. The size of signatures is still constant and a table look-up 
takes O(logm) only for the table size m. 
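
The table look-up described above can be sketched with a binary search over the update dates. This Python fragment is a minimal illustration; the dates and the placeholder key labels are invented for the example and the function name `key_at` is hypothetical.

```python
from bisect import bisect_right
from datetime import date

# History of group property keys, sorted by the date each came into force.
# The key values are placeholders standing in for actual U_M values.
history = [
    (date(2000, 1, 10), "U_M_v1"),
    (date(2000, 3, 2), "U_M_v2"),
    (date(2000, 7, 19), "U_M_v3"),
]
dates = [d for d, _ in history]


def key_at(signing_date):
    """Return the group property key in force on signing_date (O(log m))."""
    idx = bisect_right(dates, signing_date) - 1
    if idx < 0:
        raise LookupError("signature predates the first group property key")
    return history[idx][1]
```

A verifier would read the generation date embedded in the signature, call `key_at` on it, and use the returned $U_M$ in the verification equation.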

Verifying Signature : Given a signature, it is verified that the signature
satisfies the verification condition given in Definition 1. If it does, the
verifier trusts that the signer had a valid membership key, chose a random number
$w$ honestly, and formed $\beta = y_R^w h^{e_N w}$. Then he checks whether
$(U_M/\alpha^{e_N})\,\beta = b$ holds. This equality holds if and only if the
signature was generated by a valid member of the group.

User-Tracing : To reveal the originator of a given signature
$\sigma = (c, s_1, s_2, s_3, a, b, d, \alpha, \beta)$ of message $m$, the revocation
manager first checks its correctness and then computes $y_G := b/a^{x_R}$. For the
proof of unforgeability of user-tracing, he issues a signature
$P := SPK\{ (\rho) : y_R = g^{\rho} \wedge b/y_G = a^{\rho} \}(y_G \| \sigma \| m)$ and
reveals arg. This SPK shows the equality of the discrete logarithms of $y_R$ and
$b/y_G$, and it is a statistical zero-knowledge proof of knowledge of the
discrete logarithm of $y_R$ ($= g^{x_R}$). He looks up in the group-member list and

Sign-Tracing : To find whether a signature
$\sigma = (c, s_1, s_2, s_3, a, b, d, \alpha, \beta)$ was generated by a specific
(illegal) member, the membership manager sends $(a, \alpha, \beta)$ and $y_G$ to
the revocation manager, where $y_G$ is the specific member’s public key. The
revocation manager computes $y_G \cdot \alpha^{e_N} / (\beta / a^{x_R})$ and checks
whether the result equals $U_M$. If the signature was generated by the member, he
sends 1 to the membership manager. If the signature was not generated by the
member, the revocation manager learns nothing except that the member did not
generate it.



5 Security Analysis 

We discuss the security of the proposed scheme. The following theorem implies
that a non-member or a deleted group member holding an obsolete secret key
cannot generate any valid signature, by showing that forging a valid signature is
equivalent to solving the RSA problem.

Theorem 1. There exists a probabilistic polynomial-time algorithm that on input
$y_R$, $y_G$, $h$, $U_M$ and $e_N$ outputs $(w, \alpha)$ satisfying
$(U_M/\alpha^{e_N})\,\beta = b$, where $\beta = y_R^w h^{e_N w}$ and
$b = y_G y_R^w$, if and only if it is able to solve the RSA problem.

(Sketch of Proof) Suppose that, given $y_R$, $y_G$, $h$, $U_M$ and $e_N$, a
probabilistic polynomial-time algorithm $A$ can find a valid $(w, \alpha)$ such
that $(U_M/\alpha^{e_N})\,\beta = b$. Then we have the following.

$$U_M \beta = b\,\alpha^{e_N}. \qquad (1)$$

From (1), we have the following equation.

$$(U_M/y_G)^{1/e_N} = \alpha/h^{w}. \qquad (2)$$

This implies that $A$ finds a value $(w, \alpha)$ satisfying (2). (Since
$U_M := y_1 \cdots y_m y'$ where $y' \in_R G$, $U_M$ can be regarded as a random
value in $G$ and hence $U_M/y_G$ is random in $G$.)

Therefore, given a value $m^{e_N}$ with $m \in G$, substituting $m^{e_N}$ for
$(U_M/y_G)$ in equation (2) results in

$$(m^{e_N})^{1/e_N} = \alpha/h^{w}. \qquad (3)$$

Thus we find $m = \alpha/h^w$ and can solve the RSA problem.

Conversely, if there exists an algorithm $A$ which can solve the RSA problem,
then for a randomly chosen $w$, $A$ outputs $\alpha$ on the input pair
$((U_M/y_G)\,h^{e_N w},\ e_N)$ such that $(U_M/\alpha^{e_N})\,\beta = b$. □

In the rest of this section we only discuss unforgeability of signatures and
sign-tracing verification. It is straightforward to check that the other security
properties are satisfied in our scheme just as in the scheme in [4] on which our
scheme is based. Hence those are omitted here.

Unforgeability of Signatures : Only valid group members can generate valid
signatures, which can be user-traced and sign-traced by the revocation manager.
(Under Assumption 2, it is infeasible for anyone without knowledge of the group’s
order to compute a valid membership key. Therefore only the membership manager
can generate a membership key, by executing the join protocol. So only the group
members who have executed the join protocol with the membership manager can
generate valid group signatures. Furthermore, the revocation manager is able to
find the public key of a signer by decrypting $(a, b)$ of a signature using his
private key, computing $y_G = b/a^{x_R}$ [4].) Due to Theorem 1, it is infeasible
for a group member who has left the group to generate valid signatures.

Unforgeability of sign-tracing verification : Given $(\alpha, \beta)$, if the
revocation manager returns 1 to the membership manager, the membership manager
computes $\alpha^{e_N} b/(U_M \beta)$. This value is 1 if and only if the
revocation manager has executed the sign-tracing correctly.



6 Conclusion 

The complexity of member deletion in group signature schemes has been an
obstacle to applying the concept of group signatures in real applications. In
this paper, we proposed the first efficient group signature scheme that allows
member deletion. This scheme can be viewed as an extension of the scheme
proposed in [4] that adds a member deletion procedure. For each deletion, our
scheme requires two modular exponentiations, two modular inversions and a few
modular multiplications by the membership manager and only one modular
multiplication by each member, without re-issuing membership keys. For the
verification of signatures or sign-tracing, besides the computations needed in
[4], one table look-up is performed, which takes O(log m) for table size m. The
length of the group public key and the size of signatures are constant.

Our scheme is based on the specific scheme proposed in [4] and is not applicable
to other group signature schemes. Furthermore, each signature is time-stamped and
the history of old group public keys must be available. Further research is
required for more efficient and generic group signature schemes.



Abstract. It is commonly acknowledged that customers' privacy in
electronic commerce should be well protected. The solutions may come 
not only from the ethics education and legislation, but also from 
cryptographic technologies. In this paper we propose and analyze a 
privacy protection scheme for e-commerce of digital goods. The 
scheme takes cryptography as its technical means to realize privacy 
protection for online customers. It is efficient in both computational 
cost and communication cost. It is very practical for real e-commerce 
systems compared with previous solutions. The cryptographic technique 
presented in this paper is rather simple. But the scheme has great 
application potential in reality. We give careful security analysis to the 
scheme. 



1 Introduction 

Electronic commerce is growing with a surprising speed. We have seen various 
predictions on the amount of transaction revenues coming from e-commerce by the 
year 200X. The figures are really impressive. It seems that e-commerce will be a hot 
topic for at least the next decade and will penetrate into everyone's daily life. 

Privacy protection in e-commerce is emerging as an issue of common concern.

Privacy protection may refer to different problems in different settings or
applications. Currently the most frequently discussed problem is that some
e-merchants may collect their customers' personal information and deliver it to
others for business purposes. The solutions for this problem may come from some
kind of legislation, as done or being done in some countries, or from certain
ethics education as suggested in [7].

Another problem related to privacy concerns anonymity, which interests both
industry and academia. We have noticed that industry moves quite fast in this area.
One example is that Zero-Knowledge System Inc., a Canadian company, presents
various solutions for anonymity in online activities such as web surfing, FTP and
online chatting [8]. There are several new companies, the so-called anonymizers,
providing such services. Chaum's anonymous digital cash [6] is an effort from

D. Won (Ed.): ICISC 2000, LNCS 2015, pp. 162-170, 2001. 

Springer-Verlag Berlin Heidelberg 2001

academia. It is a nice piece of mathematical work and it has been followed by a
large number of good research papers in the past decade. Unfortunately, it seems
that anonymous digital cash has not taken off in practice.

In this paper, we consider a different privacy protection problem. The scenario of
our privacy problem is the e-commerce of digital goods. Consider the situation where
an online customer buys an e-book or e-journal. The customer may not wish to
disclose which book or which journal he is buying since it may, in some sense, reveal
his or her preferences or habits. Such privacy protection is extremely hard for
physical goods. But for digital goods, there do exist techniques to fulfill such a
requirement. As long as the techniques are efficient and add no large additional
cost, e-merchants may be willing to exploit them in order to attract customers. Let
us look at it from another angle. If two merchants sell the same e-goods (at the same
prices) but one provides the privacy protection while the other does not, customers
would prefer to buy from the first merchant. This is especially true if the e-goods
are more or less sensitive. Examples of such goods may include online video, online
music, digital pictures, digital maps (or travel guidance packages for different
cities), patent drafts, e-books, e-journals, e-news, game programs etc. Buying patent
drafts or other IP (intellectual property)-related information may fall in the B-to-B
category, for which this privacy protection problem is even more important since it
may directly relate to business secrets.

In this paper we present and analyze an efficient and practical scheme for the
privacy protection problem described above. The paper is organized as follows. In
Section 2, we describe the system architecture of our privacy protection scheme and
present its main advantages. In Section 3 we discuss some other issues related to the
scheme such as authentication, payment, non-repudiation and copyright protection. In
Section 4 we present the cryptographic technique exploited in the scheme and discuss
related previous work. In Section 5 we give a security analysis of our scheme.



2 Description and Features of the System 

In our scheme we exploit two symmetric key cryptosystems, denoted by E and CE,
respectively. E is a traditional symmetric key cryptosystem such as DES or AES. We
denote the ciphertext of $m$ with key $k$ by $E(k, m)$, and the decryption of
ciphertext $c$ with key $k$ by $E^{-1}(k, c)$.

CE is a commutative encryption algorithm that satisfies the following property: for
any two keys $k_1$ and $k_2$ and any message $m$,

$$CE(k_1, CE(k_2, m)) = CE(k_2, CE(k_1, m)).$$

We will present a concrete CE in Section 4. The decryption of $c$ with key $k$ is
denoted by $CE^{-1}(k, c)$.

For simplicity, we consider an abstract model where the merchant has $n$ digital
goods for sale. Denote them by $M_1, M_2, \ldots, M_n$. Suppose a customer wants to
get $M_i$ without the merchant knowing what $i$ is. The procedure is as follows.

The merchant randomly chooses $n$ secret keys of cryptosystem E, denoted by
$K_1, K_2, \ldots, K_n$, and a secret key $S$ of cryptosystem CE. All these secret
keys must be kept very carefully by the merchant. In particular, the secret key $S$
acts as a master key, which must not be compromised.

Denote

$$C_1 = E(K_1, M_1), \quad D_1 = CE(S, K_1)$$
$$C_2 = E(K_2, M_2), \quad D_2 = CE(S, K_2)$$
$$\vdots$$
$$C_n = E(K_n, M_n), \quad D_n = CE(S, K_n)$$

Then the merchant puts $\langle C_1, D_1 \rangle, \langle C_2, D_2 \rangle, \ldots, \langle C_n, D_n \rangle$
onto a publicly accessible directory. Anyone is allowed to download anything freely
and anonymously from the directory.

The customer downloads $\langle C_i, D_i \rangle$. Then he randomly chooses a secret
key $R$ of cryptosystem CE and encrypts $D_i$ with $R$. Then he sends
$U = CE(R, D_i)$ to the merchant.

The merchant decrypts $U$ with $S$ and sends $W = CE^{-1}(S, U)$ to the customer. The
customer obtains $K_i$ by decrypting $W$ with $R$, $K_i = CE^{-1}(R, W)$. Now he
obtains $M_i$ by a further decryption step, $M_i = E^{-1}(K_i, C_i)$.
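
The message flow above can be exercised end to end with a small Python sketch. It rests on stated assumptions rather than on the paper's own construction: the concrete CE is only given in Section 4, so this sketch swaps in the well-known exponentiation cipher $CE(k, m) = m^k \bmod P$ over a prime $P$, and uses a hash-based XOR keystream as a stand-in for E (DES or AES in the text). All function names are hypothetical.

```python
import hashlib
import secrets
from math import gcd

P = 2**127 - 1                        # a Mersenne prime; toy-sized (assumption)


def ce_keygen():
    """Random CE key that is invertible modulo P - 1."""
    while True:
        k = secrets.randbelow(P - 3) + 2
        if gcd(k, P - 1) == 1:
            return k


def ce(k, m):
    """Commutative encryption: m^k mod P."""
    return pow(m, k, P)


def ce_inv(k, c):
    """Decryption: raise to the inverse of k modulo P - 1."""
    return pow(c, pow(k, -1, P - 1), P)


def e(key, data):
    """Stand-in for E: XOR with a SHA-256 counter keystream (self-inverse)."""
    stream, counter = b"", 0
    while len(stream) < len(data):
        block = key.to_bytes(16, "big") + counter.to_bytes(4, "big")
        stream += hashlib.sha256(block).digest()
        counter += 1
    return bytes(x ^ s for x, s in zip(data, stream))


# Merchant: encrypt each good under K_i and lock K_i under the master key S.
goods = [b"e-book A", b"e-journal B", b"map C"]
S = ce_keygen()
K = [secrets.randbelow(P - 3) + 2 for _ in goods]
directory = [(e(K_j, M_j), ce(S, K_j)) for K_j, M_j in zip(K, goods)]

# Customer: download <C_i, D_i>, blind D_i with R, and ask for decryption.
i = 1
C_i, D_i = directory[i]
R = ce_keygen()
U = ce(R, D_i)
W = ce_inv(S, U)                      # merchant removes S without seeing K_i
K_i = ce_inv(R, W)                    # customer removes R
M_i = e(K_i, C_i)
```

The merchant only ever sees $U$, which is $K_i$ raised to the product of $R$ and $S$; commutativity is what lets the two layers be removed in either order.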

The whole procedure is shown in the following figure. 




Here the public directory may be either hosted by an Internet application service
provider run by other parties, or the merchant's own public directory. We assume that
the download of $\langle C_i, D_i \rangle$ can be anonymous. The assumption is
realistic since when the customer downloads $\langle C_i, D_i \rangle$, he need not
show personal information such as a membership or credit card number. If the download
goes through some specific proxy, the customer's IP address can be hidden. Or if the
download is through dial-up, the IP address changes every time. In addition,
Zero-Knowledge System Inc. [8] provides some interesting technologies for anonymous
download.

The interaction between the customer and the merchant cannot be anonymous since
the merchant must know whom he is dealing with. When the merchant decrypts $U$ for
the customer, the service is a charged service. Either membership authentication
or a payment is needed, which would disclose some information about the customer.
However, under our scheme the merchant has no idea which digital good was
bought by the customer.

Our scheme has the following advantages.

1 The merchant can never learn what $i$ is, no matter how maliciously the
merchant behaves. Of course the merchant can deny service by not returning $W$
or by returning a fake $W$, but he can never learn the intention of the
customer. The customer's privacy is perfectly protected.

2 The customer can get at most one digital good in each run of the scheme. This
is very important for a charged service. This requirement is also what makes
the scheme meaningful: otherwise, letting the customer get everything would
trivially satisfy feature 1.

3 Statistics are still possible. Although the merchant cannot know who gets what,
he can still know how many times each digital good has been downloaded (not
bought). Such data collection is genuinely needed in e-business.

4 The system is reliable. Even if some key $K_i$ is compromised or published by
some malicious customer, the master secret key $S$ and the other secret keys are
not affected at all. The merchant just needs to replace
$\langle C_i, D_i \rangle$ using a new $K_i$.

The above features will be discussed in more detail in the section on security
analysis, after we present the concrete CE in Section 4.

Note. We do not consider the situation where the merchant displays a wrong $C_i$ so
that the customer cannot obtain the correct $M_i$. There is no solution at all if the
merchant decides not to provide service to the customer.



3 Some Other Issues 

Authentication 

We skipped the authentication details when we presented our scheme in Section 2.
Authentication is necessary in some situations since the merchant is not likely to
perform the decryption operation $W = CE^{-1}(S, U)$ for everyone. Authentication is
especially necessary when the customer subscribes to the merchant for a membership.
There are many ways to do authentication: public key solutions, secret key solutions,
password-based solutions, and so on. The crypto research community has devoted much
effort to this topic in the past two decades. Which kind of authentication scheme to
use with our scheme depends on the concrete application.

Payment and Prices 

Payment can be made in two ways. The first is membership. Owning a
membership, a customer is allowed limited or unlimited access to the digital
goods. In the limited case, the merchant will perform a limited number of
decryption operations for the customer. In the unlimited case, the merchant will do
so whenever the customer requests. However, we recommend the limited model for our
scheme since the merchant incurs computation cost.

The second way is pay-per-piece. For this kind of payment, privacy can only be
guaranteed among all the digital goods with the same price. That is, the merchant
cannot know which good of that price the customer is buying. However, a uniform
price is much more likely for digital goods than for physical goods. For example, it
is quite possible to have the same price for all online videos in an online video
shop.

Non-repudiation 

It is possible that a customer or a merchant is sometimes dishonest. When a dispute
occurs between the customer and the merchant, another party may be needed to
judge who is cheating. The solution to this problem is simply to ask the customer to
sign $U$ and the merchant to sign $W$. In the process of resolving the dispute, the
merchant proves to the judge that $W = CE^{-1}(S, U)$ without disclosing $S$. In that
way, the master secret key $S$ is not given to the judge, which is a necessary
protection for the merchant. After we present CE in Section 4, we will show how to
prove $W = CE^{-1}(S, U)$ without disclosing $S$.

Copyright Protection 

Copyright protection is a badly needed feature in e-commerce of digital goods.
However, it is a very difficult issue from a technology viewpoint. So far there is no
satisfactory solution. Watermarking technology is studied for the copyright
protection of multimedia digital goods; it aims at catching illegal copies instead of
preventing them. Digital goods in the form of text, such as e-books and e-journals,
can hardly be protected by watermarking.

Another means is so-called tamper-resistant software, which is still at an early 
stage. In tamper-resistant software technology, all the digital goods are encrypted and 
the decryption is done inside the tamper-resistant software. The framework of our system 
meets the requirements of tamper-resistant software protection very well, so tamper- 
resistant software technology can be combined well with our privacy protection 
scheme. 

Flexible Distribution Means 

The exchange or distribution of encrypted digital goods among customers is 
encouraged by the merchants. Another means of distribution is by CD. It is estimated 
that low-cost CDs with a capacity of dozens of GB will emerge in the near future. It is then possible to 
store a large number of encrypted digital goods on a CD and sell it at a very low price 
(the price of the storage medium). Decryption must be paid for when the contents are 
wanted. 

4 Related Cryptography 

Private Information Retrieval 

In cryptography, the topic of retrieving information from a database without 
disclosing what the information is has been studied under the terminology of PIR 
(private information retrieval). The private information retrieval problem was first 
formalized and studied in [1]. The solutions provided in [1] are based on multiple 
databases and toward information-theoretical security. However, the assumption that 
the multiple databases would not communicate with one another is too strong for 
practical implementation. Later, in [2], [3] and [5], private information retrieval 
schemes with a single database were proposed. These solutions are based on 
computational assumptions, such as the hardness of factoring n=pq. However, the 
computational costs of these solutions are very large because they process the database bit by bit. 
For example, the scheme in [3] needs O(N) multiplications 
modulo a 1024-bit number to retrieve only one bit, where N is the number of bits 
of the whole database. Such schemes are not acceptable for practical use. Any 
private information retrieval scheme for practical use should process messages file- 
by-file instead of bit-by-bit. 

Security for the database is studied in [4]: a customer should not be able to 
retrieve more messages than the scheme entitles him to retrieve. This 
issue, which was ignored in [2] and [3], is important in e-commerce of digital 
goods. 

All the previous PIR schemes aim at reducing communication cost. The scheme in 
[5] even achieves a communication cost of poly(log N), while those in [2], [3] and 
[4] have communication cost $O(N^c)$ for $c<1$. From a mathematical viewpoint, those 
schemes are beautiful pieces of research. But from an implementation viewpoint, they 
are completely impractical, since they all require computation complexity of at 
least O(N). This makes them infeasible even for a small database. Realistically, they 
are not necessary either: they ignore the different attributes of various Internet 
communication protocols, as explained in Section 2. 

Using Commutative Cipher 

In this paper we use a commutative encryption algorithm CE to achieve a property 
similar to that of PIR. But our scheme is not a PIR scheme, since we do not aim at reducing 
communication complexity. Instead, we build our scheme on the assumption 
that anonymous download is available. 

It is well known that not every symmetric key cryptosystem has the commutative 
property. In fact, none of the well-known symmetric key ciphers is commutative, with 
the exception of stream ciphers (encryption is just XOR), which cannot be applied to our scheme. When 
more than one message is encrypted with the same key by a stream cipher, a unique 
number must be assigned to each message to indicate which session of the key 
stream is being used (or the number is integrated into the key so that different key 
streams are generated for different messages). That number must accompany 
the ciphertext; otherwise, decryption cannot be done. In our scheme, such a 
number would definitely disclose which digital good the customer is trying to buy. 

We use a commutative cipher as follows. 

Let $p=2q+1$, where the primes p and q are public parameters of the system. The size of p is 
recommended to be 1024 bits. Let the key $S \in Z_{2q}$ be an odd integer other than q. For any 
message $M \in Z_p$, the encryption of M with key S is 

$C = M^S \bmod p$ 

Such an encryption is clearly commutative, since $(M^S)^R = (M^R)^S \bmod p$. But it may have security problems, as 
explained in the next section. A modification is needed to make the cipher 
secure. The key point is to add a padding format, i.e., instead of 
letting $C=M^S \bmod p$, we let $C=(P(M))^S \bmod p$, where P is a padding format from which it is easy 
to find M. We will discuss P in the next section. 

The scheme presented in Section 2 can be expressed in more detail as follows. 

Merchant (setup): 
    Randomly choose $K_1, \ldots, K_n \in \{0,1\}^{128}$ and a 160-bit odd integer S. 
    $C_i = E(K_i, M_i)$, $D_i = (P(K_i))^S \bmod p$ for $i = 1, \ldots, n$ 
    $T = 1/S \bmod 2q$ 

Customer: 
    Randomly choose a 160-bit odd integer R. 
    $Q = 1/R \bmod 2q$ 
    Anonymous download of $\langle C_i, D_i \rangle$ 
    $U = D_i^R \bmod p$ 

Customer -> Merchant: U 

Merchant: 
    $W = U^T \bmod p$ 

Merchant -> Customer: W 

Customer: 
    $P(K_i) = W^Q \bmod p$ 
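
The message flow above can be traced with toy numbers. The following is only a sketch under stated assumptions: the safe prime p = 2039 (q = 1019) is far too small for real use, the helper names (`random_odd_key`, `pad`, `unpad`) are hypothetical, and the padding P is replaced by the identity purely for illustration, which Section 5 argues is not a safe choice.

```python
import secrets

# Toy parameters (assumption): a small safe prime p = 2q + 1.
# The paper recommends a 1024-bit p; these values are for illustration only.
q = 1019
p = 2 * q + 1  # 2039


def random_odd_key():
    """Pick an odd exponent in [1, 2q) other than q, so it is invertible mod 2q."""
    while True:
        k = secrets.randbelow(2 * q) | 1
        if k != q:
            return k


def pad(key):
    """Placeholder for the padding P (assumption: identity, NOT a safe choice)."""
    return key


def unpad(padded):
    """Inverse of the placeholder padding."""
    return padded


# Merchant setup: content keys K_i, master key S, and T = 1/S mod 2q.
content_keys = [123, 456, 789]                  # stand-ins for 128-bit keys K_i
S = random_odd_key()
T = pow(S, -1, 2 * q)                           # requires Python 3.8+
D = [pow(pad(k), S, p) for k in content_keys]   # D_i = P(K_i)^S mod p

# Customer: picks item i and blinds D_i with a fresh key R.
i = 1
R = random_odd_key()
Q = pow(R, -1, 2 * q)
U = pow(D[i], R, p)                             # U = D_i^R mod p, sent to merchant

# Merchant: removes its own encryption layer without learning i.
W = pow(U, T, p)                                # W = U^T mod p, sent back

# Customer: removes the blinding layer and recovers the content key.
recovered = unpad(pow(W, Q, p))                 # P(K_i) = W^Q mod p
print(recovered == content_keys[i])
```

The merchant only ever sees U, which is D_i raised to a random secret exponent, so the sketch also shows why the order of the two exponentiations does not matter.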



Proof of Legal W without Disclosing S 

In Section 3 we addressed the issue of non-repudiation. When a dispute occurs 
between the merchant and the customer, they can go to a third party for resolution. 
This requires the customer to sign U and the merchant to sign W in their interaction. The 
third party is then convinced that U and W indeed originated from the customer and 
the merchant, respectively. He must still be convinced that U and W satisfy the 
relationship $W=U^T \bmod p$ for $T=1/S \bmod 2q$. This can typically be done by the so- 
called proof of equivalence of discrete logarithms, where the merchant can prove to 
everyone that 

$W=U^T \bmod p$ and $Y=g^T \bmod p$ 

hold for the same T without disclosing T. Here Y and g are values published by the 
merchant in advance. 
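
The text does not say which equality-of-discrete-logarithms proof is meant. One well-known candidate is a Chaum-Pedersen style proof, sketched below as an assumption rather than as the authors' construction. The toy modulus, the SHA-256 based challenge (a Fiat-Shamir style shortcut) and all function names are illustrative choices.

```python
import hashlib
import secrets

q = 1019
p = 2 * q + 1        # toy safe prime (assumption), far too small for real use
ORDER = p - 1        # exponents are reduced mod 2q, as in the scheme


def challenge(*values):
    """Derive a challenge by hashing the public values (assumed Fiat-Shamir style)."""
    data = "|".join(str(v) for v in values).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % ORDER


def prove_equal_dlog(U, g, W, Y, T):
    """Prover (merchant): show that W = U^T and Y = g^T share the exponent T."""
    k = secrets.randbelow(ORDER)
    a, b = pow(U, k, p), pow(g, k, p)      # commitments under both bases
    c = challenge(U, g, W, Y, a, b)
    z = (k + c * T) % ORDER                # response
    return a, b, z


def verify_equal_dlog(U, g, W, Y, proof):
    """Verifier (judge): check both relations with the same response z."""
    a, b, z = proof
    c = challenge(U, g, W, Y, a, b)
    ok_u = pow(U, z, p) == a * pow(W, c, p) % p
    ok_g = pow(g, z, p) == b * pow(Y, c, p) % p
    return ok_u and ok_g


# Usage with toy values: T is the merchant's secret decryption exponent.
T = 77
U, g = 1234, 3
W, Y = pow(U, T, p), pow(g, T, p)
proof = prove_equal_dlog(U, g, W, Y, T)
print(verify_equal_dlog(U, g, W, Y, proof))
```

The judge learns that the same exponent links (U, W) and (g, Y), while T itself is masked by the random k.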



5 Security Analysis 

First, it is easy to see that the customer can obtain the digital good by decrypting 
the ciphertext $C_i$, since he can get $K_i$ from $P(K_i)$. In other words, if the customer and the 
merchant follow the scheme properly, the customer obtains what he wants. We 
next show that the merchant cannot learn which message the customer intends to get, 
even if the merchant deviates from the scheme. 

The merchant must figure out which $D_i$ the customer has obtained. Our assumption 
is that the merchant cannot do so from the customer's download operation. Hence the 
only available information from the customer is $U (=D_i^R \bmod p)$. Since R is randomly 
chosen and kept secret by the customer, the index i remains information-theoretically secure 
(all i's are equally probable). So the privacy of the customer is perfectly protected, 
without any computational assumption. 

On the merchant side, it is required that the customer cannot get any extra 
messages. Without loss of generality, consider the following simplified situation. 
The customer has already retrieved $K_1, K_2, \ldots, K_h$. Now the 
customer tries to recover $M_i$ ($i \ne 1,2,\ldots,h$), i.e., to find $K_i$, without 
decryption help from the merchant. 

The problem is equivalent to this: 

Having $G_1, g_1, G_2, g_2, \ldots, G_h, g_h$ such that $G_1=(P(g_1))^r$, $G_2=(P(g_2))^r$, 
..., $G_h=(P(g_h))^r$ (mod p is omitted here), find $g_i$ such that $G_i=(P(g_i))^r$ 
for some unknown r. 

There are two approaches to solving the problem. The first one is to find r through 
$G_j=(P(g_j))^r$ for $j=1,2,\ldots,h$. But this is equivalent to computing a discrete logarithm. 
Since the modulus p has 1024 bits, computing the discrete logarithm is infeasible. 

The second one is to compute $G_i$ from $G_1, G_2, \ldots, G_h$ through some arithmetic 
operations. This depends on how the padding format P is chosen. 

A bad P, such as P(K)=K, may lead to a flaw under an attack like that in [9]. This 
is because the $K_i$'s are only 128 bits long, so some of them may have a simple 
factorization. Even a simple attack may succeed with tolerable computational 
complexity: a greedy customer gives $U=(D_1 D_2)^R$ to the merchant for decryption. Then 
the customer obtains $K_1 K_2$, since P(K)=K. Now $K_1$ and $K_2$ are both 128-bit integers, and it is 
not very hard to find them by exhaustive trial after factoring the product. 
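
The attack just described can be reproduced with toy numbers. The sketch below assumes the unpadded cipher P(K)=K, a toy safe prime, and small keys whose product stays below p (as a 256-bit product stays below a 1024-bit p); the helper `factor_pairs` is a hypothetical stand-in for the factorization step.

```python
q = 1019
p = 2 * q + 1                 # toy safe prime (assumption)

S = 77                        # merchant's master key (odd, invertible mod 2q)
T = pow(S, -1, 2 * q)

K1, K2 = 6, 35                # two content keys; K1 * K2 < p in this toy setting
D1, D2 = pow(K1, S, p), pow(K2, S, p)   # unpadded: D_i = K_i^S mod p

# Greedy customer: blinds the PRODUCT of two downloaded values.
R = 45
Q = pow(R, -1, 2 * q)
U = pow(D1 * D2 % p, R, p)

# Merchant performs (and charges for) a single decryption.
W = pow(U, T, p)

# Customer unblinds and obtains K1 * K2 in one paid request.
product = pow(W, Q, p)


def factor_pairs(n):
    """List all ways to split n into two factors greater than 1."""
    return [(d, n // d) for d in range(2, int(n ** 0.5) + 1) if n % d == 0]


candidates = factor_pairs(product)
print(product, (K1, K2) in candidates)
```

Each candidate pair can then be tested against the downloaded ciphertexts, which is why a padding P that destroys this multiplicative structure is needed.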

We conjecture that a simple padding format as follows should be safe. Let H be a 
one-way hash function $H: \{0,1\}^* \to \{0,1\}^{160}$. 

P(^=//(^ll//^(/0ll//\/0ll/: lf{K)\\H\K)\\ty{K)H\K) 

Another possible choice is to take the same padding format as in OAEP [10], 
i.e., $P(g) = (g \oplus Gen(Rand)) \,\|\, (Rand \oplus H(g \oplus Gen(Rand)))$, where Gen is a pseudo-random 
generator and Rand is a random number. 

Note. The padding P is applied only to Ki but never to U and W. Otherwise, the key Ki 
cannot be recovered. 



6 Concluding Remarks 

In this paper we propose an efficient and practical scheme for privacy protection in e- 
commerce of digital goods, based on a cryptographic technique, namely a 
commutative symmetric key cipher. However, it is not a general-purpose cipher from the 
cryptographic viewpoint; it is proposed specifically for our scheme. If it is used in 
other applications, its security must be addressed very carefully. It is also not as efficient 
as the usual symmetric key ciphers. Although modular exponentiation is involved in 
the cipher, the whole scheme is much more efficient than the previous PIR schemes. 
Another merit of our scheme is that it matches the overall structure of 
copyright protection, where digital goods are encrypted with different keys and 
decrypted in tamper-resistant hardware/software. 

Abstract. This paper proposes a new Internet bidding system that of- 
fers anonymity of bidders and fairness to both bidders and the auc- 
tion server. Our scheme satisfies all the basic security requirements for a 
sealed-bid auction system, without requiring multiple servers. 



1 Introduction 

An Internet auction system is somewhat similar to a normal non-Internet auction 
scheme, but differs from it in certain aspects. One major difference is that an 
Internet auction system requires a sealed bidding process over the network. A 
sealed-bid auction is one in which secret bids are protected from disclosure before 
the bidding deadline. After the deadline has passed, all the bids are opened and 
the winning bid is determined according to some deterministic rules. Typically, 
it is used in the auctioning of artwork, real estate and government contracts. 

A major challenge in developing a secure Internet auction concerns 
fairness between the bidders and the auction server. This includes the means of securing 
information exchange as well as achieving electronic payments. Secure informa- 
tion exchange provides a secure and fair means for different, usually untrusted, 
parties to exchange information via the Internet. These parties may never meet 
physically and can be geographically apart. Secure electronic payments need 
to ensure correct operation and provide convenience for all the parties involved, just 
like conventional payment systems. To date, there have been several research 
works on both fair exchange and secure payment issues [1,5, 3, 6, 7]. 

Some recent work on securing sealed-bid auctions has been done in [11,14]. 
These systems consider the auction service to be distributed over multiple non- 
related servers. Security, secrecy and fairness of bids are achieved, provided that 
no more than a certain number of the servers are faulty. However, in these systems 
the bidders, while they need not trust any individual auction server, 
must trust the auction service as a whole. A sealed-bid auction is a two- 
sided transaction: on one side we have the bidders, and on the other side 
the auction service. These two entities usually have conflicting commercial 
interests, which makes the requirement that one entity (the bidders) trust 
the other (the auction service) somewhat unrealistic in practice. 

In this paper, we consider a secure anonymous auction service where we 
place a minimal trust assumption between bidders and the auction server. In 
our system, a bidder can make secret bids to the server for a bidding instance 
advertised on the Internet. The server has no knowledge of the 
bid made by the bidder before the end of the bidding period. After the deadline, 
only the anonymous identity associated with the highest bid is revealed, while 
all others remain unknown. It therefore ensures maximum bidder privacy. Our 
scheme also prevents the winning bidder from unauthorised withdrawal. The 
trusted party remains off-line until it receives a request from the auction server 
for decrypting a cheating winner’s escrowed bid opener. Our scheme works with 
any on-line anonymous electronic payment system. It does not require the multiple 
auction servers that have been deemed necessary for a sealed-bid auction [1,2]. The 
basic idea behind our protocols is similar to that of the digital cash protocol 
given in [8]. 

The rest of this paper is organized as follows. In Section 2, we describe the 
security requirements for a sealed-bid auction. Section 3 reviews some prelim- 
inaries that are needed in our auction service scheme. Section 4 presents our 
new auction service scheme and describes the setup of the auction server and the 
bidder systems. Section 5 gives the bid casting protocols. Section 6 describes the 
determination of the winning bid and Section 7 concludes our work. 



2 Secure Sealed-Bid Auction 

Informally, a sealed-bid auction is a service in which services and goods are 
auctioned by an auction server. The goods and services are usually supplied by 
sellers. Each item is sold to a bidder through an auction. The auction consists 
of two phases. The first is the bidding period in which bidders submit sealed 
bids to the server. Once the bidding is closed, the second phase starts. In the 
second phase, all the bids are opened and a winner is determined and possibly 
announced. The choice of the winner is done using a publicly known deterministic 
process agreed between the seller and the auction server. For convenience, we 
assume that the winning bid is the highest bid. 

Securing an auction system involves several issues that are concerned with the 
relationship between the auction server and the bidders. The auction server may 
deal with a large number of bidders whose behavior cannot be trusted. Also, as all 
the bidders compete against one another with the common objective of obtaining 
the goods/services, the system must ensure fairness for all parties. To achieve 
these objectives, the auction system must satisfy the following requirements: 

— Auction server 

— Warranty of Funds/Elimination of Faulty Bids/Default Pre- 
vention: The auction server should be able to eliminate bids that are 
not valid or have insufficient funds prior to the selection of the win- 
ning bid. Once a bid is awarded, it must be infeasible for the bidder to default 
on the bid. This prevents incidents such as the FCC auction in which 
13 winning bidders defaulted on 
their bids [10]. This is crucial to the success of the auction, as our system is 
anonymous, i.e., the identities of the bidders are not known at any stage, 
and it is infeasible and expensive to hold faulty bidders accountable. 

— Fairness: When the auction server delivers the goods/services to the 
winning bidder, it must be able to receive the payments regardless of 
how malicious the bidder may be. 

— Bidders 

— Privacy: The identities of bidders (winners and losers) must be pro- 
tected. 

— No Misuse of Bids: Once a bid is submitted, it should be used solely for 
the bidding purpose. It should be infeasible for anyone even the auction 
server to misuse the bid in order to disadvantage some bidders. Such cases 
occur when the server opens submitted bids and informs a collaborator 
of their amounts so that the collaborator can submit a higher bid. 

— Fairness: The bidders should be assured that the winning bid is cho- 
sen properly. It must be infeasible for the server to award the 
goods/services to someone other than the winning bidder without being 
detected. The winning bidder should have sufficient evidence to prove 
any misbehavior of the auction server. The winning bidder must be able 
to receive the goods/services once the payment is made. Other bidders 
should not lose any funds in the auction. 

It is worth noting that from time to time, disputes may occur during a 
particular auction. Therefore it is necessary to assume the existence of a trusted 
authority to resolve these disputes. However, our model is optimistic [1,2] in that 
the trusted authority is not involved unless there is a fault. In most disputes 
that occur between the auction server and the bidders, in which the action does 
not benefit the seller, the seller could assume the role of the trusted authority. 
This holds for many disputes that we are aware of, except that the server (in 
collusion with the sellers) does not deliver the goods/service while still being 
able to obtain the payment from the winning bidder. 

In Franklin and Reiter’s system [11], there are multiple servers, and each of 
these can receive bids from bidders. The information hiding is based on a (t,n) 
secret sharing scheme. That is, the information about a bid consists of a number 
of shares, and each server can obtain at most one share per bidder. Therefore, 
the servers have no idea about the committed bid until the end of the bidding period. 
The bid is then recovered by t out of n 
servers combining their shares. 

3 Preliminary 

This section reviews some cryptographic primitives that are used in our auction 
protocol. 

Some common notations will be used throughout this paper: 

— $p_1, p_2$: two large prime numbers such that $p = 2q + 1$ is a prime number as 
well, where $q = p_1 p_2$. 

— $Z_p^*$: the multiplicative group of order $p - 1$. 

— $Z_q^*$: the multiplicative group of order $\phi(q)$ or $(p_1 - 1)(p_2 - 1)$. 

— $\mathcal{H}$: a strong one-way hash function. 



3.1 Optimistic Fair Exchange 

Optimistic fair exchange [2, 1,3] is a protocol that allows two untrusted parties 
to exchange their information in a fair manner. After the exchange, either both 
parties get the other’s information or neither party gets anything. The informa- 
tion held by the two parties can be of any format. It can be a file, a document 
or even a signature. Optimistic fair exchange protocol assumes the existence of 
an off-line trusted authority. However, the trusted authority is invisible and only 
becomes apparent in the case of disputes. 

Optimistic fair exchange makes use of a cryptographic primitive called verifi- 
able encryption. Verifiable encryption consists of a ciphertext under the trusted 
authority’s public key and a non-interactive proof that the plaintext correspond- 
ing to the ciphertext is indeed the required information. Verifiable encryption 
can be constructed for any type of information. For simplicity, we leave out the 
construction of verifiable encryption. The reader is referred to [1,4] for details. 

Let VE(m) be the verifiable encryption of the message m under the public 
key of the trusted authority (TA). The optimistic fair exchange between two 
parties A and B, holding the information $\alpha$ and $\beta$ respectively, each wanted by the other party, 
works as follows: 

Fair Exchange Protocol (run in a secure channel). 

1: First A and B agree on the condition of the exchange. Let us assume that 
A initiates the exchange. 

2: $A \to B$: $VE(\alpha)$ 

3: $B \to A$: $VE(\beta)$ 

4: $A \to B$: $\alpha$ 

5: $B \to A$: $\beta$ 

Then the exchange is completed. However, if the exchange does not terminate 
successfully, any party who has received the verifiable encryption from the other 
party can run the following recovery protocol with the trusted authority. Without 
loss of generality, let us assume that A is the party involved. 

Recovery Protocol (run in a secure channel). 

1: $A \to TA$: $\alpha$, $VE(\beta)$ and information about the exchange. 

2: TA decrypts $VE(\beta)$ to get $\beta$. Then $TA \to A$: $\beta$ and $TA \to B$: $\alpha$. Here TA 
should verify all the information and send $\alpha$, $\beta$ to the respective parties if 
and only if TA is satisfied with all the checks. 



Note that the recovery protocol is always fair. Upon completing the protocol, 
both parties will get the other’s information. In fact, in the exchange protocol, 
B is never disadvantaged. The only possible unfair state is when B gets $\alpha$ while 
A does not receive $\beta$. However, A will never send $\alpha$ to B unless A has received 
$VE(\beta)$. With $VE(\beta)$, A can run the recovery protocol with TA to obtain $\beta$. 
Hence A is not disadvantaged at any stage, i.e., the protocol is fair for both A and B. 

For our auction service, we are mainly interested in the fair exchange of 
digital signature knowledge [16]. In [1], an optimistic fair exchange protocol of 
digital signatures was proposed for several digital signature schemes. It can be 
used to exchange signature knowledge as well. For convenience, we denote by 
OFE{a, b) an Optimistic Fair Exchange of a and b. 

3.2 Blind Nyberg-Rueppel Digital Signature 

Assume that $x \in Z_q$ is the secret key of the signer; $h = g^x \bmod p$ is then the 
public key of the signer, where $g \in Z_p^*$. g, q, p are public. 

To obtain a blind Nyberg-Rueppel digital signature on a message m from 
the signer, the verifier or signature recipient needs to get a pair (r, s) of the 
form: 

$r = m g^k \bmod p$, ($k \in_R Z_q$) 
$s = xr + k \bmod q$, 

in such a way that the signer does not learn anything about either r or s. This 
can be achieved using the following process: 

1: The signer selects $\tilde{k} \in Z_q$, computes $\tilde{r} = g^{\tilde{k}} \bmod p$, and sends $\tilde{r}$ to the verifier. 

2: The verifier selects $\alpha, \beta \in Z_q$, computes $r = m g^{\alpha} \tilde{r}^{\beta} \bmod p$, $\tilde{m} = r \beta^{-1}$ and 
sends $\tilde{m}$ to the signer. 

3: The signer computes $\tilde{s} = \tilde{m} x + \tilde{k}$ and then forwards $\tilde{s}$ to the verifier. 

4: The verifier computes $s = \tilde{s} \beta + \alpha \bmod q$. 

The pair (r, s) is then a blind signature of the signer on message m. The verifi- 
cation of the signature (r, s) for message m is done by verifying 

$g^{-s} h^r r = m g^{-\tilde{m} x \beta - \tilde{k} \beta + xr + \tilde{k} \beta} = m \pmod p$. 

Furthermore, as $\alpha$ and $\beta$ are randomly chosen, the signer does not learn anything 
about (r, s). For a given signature (r, s), there exists a unique pair of $\alpha$ and 
$\beta$. Thus for each signature from the signer, the verifier can generate only one 
blind signature. A detailed discussion of the security of this scheme can be found 
in [15]. 
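
The four steps and the verification equation can be checked numerically. The sketch below uses toy parameters that are assumptions for illustration only: a prime q = 1019 with p = 2q + 1 = 2039 and g = 4 as an element of order q. The function names are hypothetical, and $\beta$ is drawn non-zero so that it can be inverted mod q.

```python
import secrets

q = 1019
p = 2 * q + 1        # toy parameters (assumption); real sizes are much larger
g = 4                # 4 = 2^2 is a square different from 1, so it has order q

x = secrets.randbelow(q - 1) + 1   # signer's secret key
h = pow(g, x, p)                   # signer's public key


def signer_commit():
    """Step 1: the signer picks k~ and returns (k~, r~ = g^k~ mod p)."""
    k = secrets.randbelow(q - 1) + 1
    return k, pow(g, k, p)


def verifier_blind(m, r_tilde):
    """Step 2: the verifier blinds m with random alpha, beta."""
    alpha = secrets.randbelow(q)
    beta = secrets.randbelow(q - 1) + 1          # non-zero, hence invertible mod q
    r = m * pow(g, alpha, p) * pow(r_tilde, beta, p) % p
    m_tilde = r * pow(beta, -1, q) % q           # m~ = r / beta mod q
    return r, m_tilde, alpha, beta


def signer_sign(m_tilde, k):
    """Step 3: the signer signs the blinded message."""
    return (m_tilde * x + k) % q


def verifier_unblind(s_tilde, alpha, beta):
    """Step 4: the verifier removes the blinding."""
    return (s_tilde * beta + alpha) % q


def verify(m, r, s):
    """Check g^(-s) * h^r * r = m (mod p)."""
    return pow(g, (-s) % q, p) * pow(h, r, p) * r % p == m % p


# Usage: one complete blind-signing session on a toy message.
m = 1234
k, r_tilde = signer_commit()
r, m_tilde, alpha, beta = verifier_blind(m, r_tilde)
s = verifier_unblind(signer_sign(m_tilde, k), alpha, beta)
print(verify(m, r, s))
```

The signer only sees `m_tilde` and its own `k`, while the final pair (r, s) depends on the verifier's private `alpha` and `beta`.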

3.3 Proof of Discrete Logarithm 

We now take a look at a scheme for proving knowledge without revealing any- 
thing about the content that is being proved. This proof scheme was initially 
proposed in [12,13]. We briefly summarise here the method of discrete loga- 
rithm proof. The objective of the proof is as follows: given a primitive $g \in_R Z_p^*$ and 
$Q = g^{\sigma} \bmod p$, where $\sigma \in Z_q$ is the secret known to the prover only and g is 
public, the prover proves her knowledge of $\sigma$ without revealing the secret to the 
verifier. The proof protocol can be either interactive or non-interactive. Here we 
consider the interactive one only: 

1: The Verifier sends a challenge c to the Prover. 

2: The Prover: 

— selects a secret number $\delta \in_R Z_q$ and then computes $h = g^{\delta} \bmod p$ and 
$w = (c\sigma + \delta) \bmod q$. 

— sends c, Q, h, w to the Verifier. 

3: The Verifier then checks: $g^w \stackrel{?}{=} Q^c h \bmod p$. 

For convenience, we denote by $DLP[\sigma : Q]$ the discrete log proof. 
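
The check in step 3 can be exercised with toy numbers. The sketch below follows the message order given in the text and only illustrates the arithmetic; the parameters (p = 2039, q = 1019, g = 4) and the names `prover_respond`, `verifier_check` and `public_value` are assumptions made for the example.

```python
import secrets

q = 1019
p = 2 * q + 1        # toy parameters (assumption)
g = 4                # element of order q in Z_p^*

sigma = 321                        # prover's secret
public_value = pow(g, sigma, p)    # the published value g^sigma mod p


def prover_respond(c):
    """Step 2: pick a fresh delta and answer the challenge c."""
    delta = secrets.randbelow(q)
    h = pow(g, delta, p)
    w = (c * sigma + delta) % q
    return h, w


def verifier_check(c, h, w):
    """Step 3: accept if g^w = public_value^c * h (mod p)."""
    return pow(g, w, p) == pow(public_value, c, p) * h % p


# Usage: one round with a fixed toy challenge.
c = 57
h, w = prover_respond(c)
print(verifier_check(c, h, w))
```

The check passes because $g^{c\sigma + \delta} = (g^{\sigma})^c g^{\delta}$, and the fresh $\delta$ masks $\sigma$ inside w.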

4 System Setup 

Our system consists of an auction server S, one or several financial institutions, and 
several bidders. We denote by B a bidder and by F a financial institution. There 
are two possible design choices regarding the Trusted Third Party (TTP): (1) there 
exists an independent TTP, who acts as a key escrow agent, or (2) the financial 
institution also acts as a TTP. For simplicity, we assume that the financial 
institution is also the TTP in our system. 

4.1 The Server Setup 

The auction server should not know the bids of bidders before the bidding dead- 
line. This is ensured by the bid structure itself and a TTP rather than secret 
sharing. Bidders are confident that their bids are safe, since the winning bidder is 
the only party who sends its bid opener to the server once it is informed that its 
bid is the highest. With a bid opener escrow scheme, the server can also prevent 
any winning bidder from withdrawal. 

The auction server has the following tasks: 

— Displays auction objects on its web page. 

— Receives/ Validates the anonymous bids from bidders. 

— Receives/ Validates the escrowed bid openers 

— Implements a fair exchange protocol. 



4.2 Bidder Setup 

Each bidder needs to establish an anonymous account at the financial institution. 
We assume that an alias or anonymous identity has been issued to the bidder B. 
Using it, B can obtain an anonymous account from the financial institution. Let 
$A = g^{\sigma} \bmod p$ be the authorised alias for bidder B, where $\sigma$ is a secret number known 
only to B. Let A be the legitimate account name for B. For bidding purposes, 
B should obtain an account certificate associated with A by the following process: 

We assume that the financial institution has a web page providing on-line 
service. In order to obtain an account certificate, the bidder accesses the web 
page and sends her request. The details of the process are as follows: 

Let (p, q, g) be public information, where $g \in Z_p^*$ and $g \in Z_q^*$, i.e., $g^q \bmod p = 1$ 
and $g^{\phi(q)} \bmod q = 1$. Let x be F’s secret key and $h = g^x \bmod p$ and $h' = 
g^x \bmod q$ be its public keys. F chooses two random numbers $w_1$ and $w_2$, computes 
$g_1 = g^{w_1} \bmod p$, $g_2 = g^{w_2} \bmod p$ as well as $h_1 = g_1^x \bmod p$ and $h_2 = g_2^x \bmod p$. 
$g_1$, $g_2$, $h_1$ and $h_2$ are then made public. Each bidder needs to pre-determine 
its bid and the bidding commitment $v = g_1 g_2^u \bmod p$, where u is the value of the 
concatenation of the bid and a secret random number. Without loss of generality, 
we will call u the bid for B. The bidding commitment v is registered with the 
financial institution as B's current identity and is associated with A. B is given $w = 
v^x \bmod p$ as the certificate of B's identity. It might also be necessary for F to 
prove to B that $\log_v w = \log_g h$. This can be done using a bi-proof of discrete 
logarithm. Note that v, y, w are used for one bid only. 

Bidding setup protocol: 

0: B sends her discrete log proof $DLP[\sigma : A]$ to F for authentication 
purposes. 

1: F chooses a random number $k \in Z_q$, computes the corresponding commitment $\delta \bmod p$ and forwards 
$\delta$ to B. 

2: B generates three random numbers $(y, x_1, x_2)$, computes $\alpha = w^y \bmod p$, 
$\beta = v^y \bmod p$ and $\Lambda = h_1^{x_1} h_2^{x_2} \bmod p$. 

3: B forms the message $m = \mathcal{H}(\alpha, \beta, \Lambda)$, generates a random number a and 
a Nyberg-Rueppel blind factor b, calculates $r = m \beta^a \delta^b \bmod p$ and sends 
$m' = r/b$ to F. 

4: F computes its Nyberg-Rueppel signature on the blinded message $m'$ by form- 
ing $s' = m' x + k$ and sends it to B. 

5: B removes the blind factor b and obtains $s = s'b + a = rx + kb + a$. 



[Figure: message flow of the bidding setup protocol between B and F. B sends $DLP[\sigma : A]$; F returns $\delta$; B sends the blinded message $m'$; F returns $s'$, from which B computes s.] 

$Cert_B = \{\alpha, \beta, \Lambda, r, s\}$ represents a valid anonymous account certificate for 
the bid u. It can be verified using the following equation: 

$\mathcal{H}(\alpha, \beta, \Lambda) = \beta^{-s} \alpha^r r \bmod p$. 

$Cert_B$ is used for one bid only. 

5 Bidding Protocol 

When B wishes to cast a bid to the auction server S, she needs to obtain her 
unique bidding number N and bidding account certificate $Cert_B$ with respect to 
her bid u from the on-line F. The bidding protocol is divided into two stages: 
casting a bid and sending an escrowed bid opener. S is able to verify the correct- 
ness of the bid and the corresponding escrowed opener. The following protocol 
is used for casting a bid: 

1: S generates a random challenge c and sends it to B. This challenge should 
be unique for each bid. For example, it can be computed as 

$c = \mathcal{H}(S \| Date \| Time \| \ldots)$ 

2: Upon receiving c, B computes a response $(r_1, r_2)$ with respect to her bid 
u, where $r_1 = x_1 + cy \bmod q$ and $r_2 = x_2 + ucy \bmod q$. The bidding token 
$(r_1, r_2)$, along with her bidding number N and the account certificate, is then 
sent to S. 

3: S checks the Nyberg-Rueppel signature on the message $\mathcal{H}(\alpha, \beta, \Lambda)$ and ver- 
ifies that $(r_1, r_2)$ are indeed consistent with the challenge c. S accepts the 
account certificate for the bid if $\mathcal{H}(\alpha, \beta, \Lambda) = \beta^{-s} \alpha^r r$ and $h_1^{r_1} h_2^{r_2} = \alpha^c \Lambda$. S 
then stores $(c, r_1, r_2, Cert_B)$. It is obvious that if the bidder and the server 
follow the correct procedures given above, then $\mathcal{H}(\alpha, \beta, \Lambda) = \beta^{-s} \alpha^r r$ and 
$h_1^{r_1} h_2^{r_2} = \alpha^c \Lambda$ must hold. In fact, we find that 

$\alpha^c \Lambda = (g_1 g_2^u)^{xyc} h_1^{x_1} h_2^{x_2}$ 
$= h_1^{x_1 + cy} h_2^{x_2 + cyu}$ 
$= h_1^{r_1} h_2^{r_2} \bmod p$. 

4: S sends a new challenge, $c'$, to B for computing the escrowed bid opener. 

5: B computes the bid opener $(r'_1, r'_2)$, $r'_1 = x_1 + c'y \bmod q$, $r'_2 = x_2 + uc'y \bmod q$, 
which are then encrypted with the TTP’s public key h, 

^&R-Z,j){q), k=g^ modq,h' = h''^ modq, mod g, i ?2 = fcr 2 modg, 

where e and k are kept secret. In order to allow S to prove that the escrowed 
opener is correct, B needs to compute 

a' = a*. A' = A^ ( modp). 

Ri, i? 2 , h',a', and A' are then sent to S. 

6: S checks correctness of the escrowed bid opener, A' modp 

Two additional proofs are needed from the bidder: (1) The knowledge 
proof on representations of discrete logarithms that proves r[ and are indeed 
encrypted correctly, and (2) The discrete log proofs on the equalities of log^/ h' = 
loggOoga a') and log^, h' = logg(log;, A'). 

For the first proof, the readers are referred to [17] for details. We here use 
the notation: KREP[{e, 1) : h''^ A which means that the discrete logarithm 
h' to base h' and a representation of R\ to bases g and r'^, and the g-part of this 
representation equals the discrete logarithm of h' to base h' . Similarly to that, 
we have KREP\{e^ 1) : h'^ A The second proof is realised as follows: 



1: The bidder (prover): 
— Selects an integer S and computes f = modg. 

— Sends / to the server S. 

2 : The server S sends a challenge r to the bidder. 

3 : The bidder: 

— Computes t = e + tS. 

— Sends t to S. 

4 : The server S: 

— Checks h'* = h' modg, a®* = modp, and A®* = modp. 

This protocol can be easily converted into a non-interactive version. 



6 Opening the Winning Bid 

After the closing day of bidding, all bidders anonymously submit the values of 
their bids along with their bidding numbers to the auction server. 
The auction server then broadcasts the highest bid and a new challenge, $c''$. The 
bidder responsible for the highest bid computes $(r''_1, r''_2)$ using $c''$, 

$r''_1 = x_1 + c''y \bmod q$, $r''_2 = x_2 + uc''y \bmod q$. 

B and S then carry out an Optimistic Fair Exchange on $r'' = (N, r''_1, r''_2)$ and 
the auction object R, i.e., $OFE(r'', R)$, where R could be either the electronic 
auction object or a receipt that ensures the physical delivery of the auction 
goods at a later stage. As part of the optimistic fair exchange process, the server 
must check the legitimacy of $r''$ as follows. The server computes the bid u and 
the current alias v: 

$\frac{r_1 - r''_1}{c - c''} = \frac{x_1 + cy - x_1 - c''y}{c - c''} = y$, 

$\frac{r_2 - r''_2}{c - c''} = \frac{x_2 + cyu - x_2 - uc''y}{c - c''} = yu$. 



From y and yu, S can easily obtain the bid value u and check if u equals the 
highest bid. After obtaining u, S computes the bidder’s anonymous bidding 
identity v = 51 modp. v is then sent to F, who can find the match between 
this value and the one in F’s alias list. With such a match, F can transfer the 
money from B’s account, represented by anonymous identity A, to S’s account. 
The evidence is undeniable as u is the bidder’s secret information, which is not 
possible to determine unless the bidder sends $r''$. 
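
The recovery of y and u from two responses can be illustrated with a small sketch. It assumes the responses have the form $r_1 = x_1 + cy$ and $r_2 = x_2 + cyu$ (mod q), as implied by the two equations above; the function name `recover_bid` and the toy parameters are hypothetical and chosen only for illustration, and q is assumed to be prime so that the modular inverses exist.

```python
def recover_bid(r1, r2, c, r1_open, r2_open, c_open, q):
    """Recover (y, u) from two response pairs that share the same x1, x2.

    Assumes r1 = x1 + c*y and r2 = x2 + c*y*u (mod q), and likewise for the
    opening responses computed with the challenge c_open.
    """
    inv_dc = pow((c - c_open) % q, -1, q)      # 1 / (c - c'') mod q
    y = ((r1 - r1_open) * inv_dc) % q          # (r1 - r1'') / (c - c'')
    yu = ((r2 - r2_open) * inv_dc) % q         # (r2 - r2'') / (c - c'')
    u = (yu * pow(y, -1, q)) % q               # divide out y to get the bid
    return y, u


# Toy parameters (far too small for real use).
q = 1019                                       # a small prime
x1, x2, y, u = 123, 456, 77, 250               # bidder's secrets and bid
c, c_open = 11, 29                             # two distinct challenges
r1, r2 = (x1 + c * y) % q, (x2 + c * y * u) % q
r1_open, r2_open = (x1 + c_open * y) % q, (x2 + u * c_open * y) % q

print(recover_bid(r1, r2, c, r1_open, r2_open, c_open, q))  # (77, 250)
```

The sketch also shows why a single response reveals nothing about u: without a second response under a different challenge, $x_1$ and $x_2$ mask y and yu.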

If the winner refuses to send $r''$ to the server, the server can send $Cert_B$ and 
$(R_1, R_2, c')$ to the TTP, who can then decrypt $(R_1, R_2)$ and send $(r_1', r_2')$ to the 
server. The server can compute u and v using $(r_1, r_2, c)$ and $(r_1', r_2', c')$. 

7 Conclusion 

We have proposed a new anonymous sealed-bid auction scheme for the Internet. 
Our scheme satisfies all basic requirements for a sealed-bid auction, without re- 
quiring multiple servers. In our system, each bidder has an anonymous account 
with a financial institution. Before casting its bid, each bidder logs onto the 
financial institution’s on-line web page to obtain an account certificate, associ- 
ated with its account, for its bid. All bids submitted to the auction server are 
protected from disclosure, before the auction deadline. After the deadline, only 
the anonymous identity associated with the highest bid is revealed, while all 
others remain unknown. It therefore ensures the maximum bidder privacy. Our 
scheme also prevents the winning bidder from unauthorised withdrawal, since 
an escrowed bid opener is sent to the server who can prove the escrowed opener 
is legitimate. The TTP remains off-line until receiving a request from the server 
for decrypting a cheating-winner’s escrowed bid opener. 

Additional note: After this paper was submitted, we noted that a similar 
method had just been proposed in [18]. 

Abstract. This paper proposes the first sealed-bid auction method which 
uses only a hash function. We use a hash chain to commit a bidding price. 
By using the hash chain, we can drastically reduce the time needed for 
bidding and opening bids. If we use a practical hash function, e.g. SHA-1, 
our method is 200,000 times faster than former methods that use public 
key cryptosystems. Accordingly, our method is capable of wide application 
in terms of the number of bidders and the range of bidding prices. 



1 Introduction 

Auctions are a basic and important method to establish the price of goods. As 
the importance of the Internet continues to increase, electronic auction services 
are also increasing. There has been a lot of research on auctions. However, existing 
sealed-bid auction methods use public key cryptosystems and so are 
computationally expensive or require a lot of communication. The inefficiency of the 
public key cryptosystems forces the auction methods to limit the number of 
bidders and the range of bidding prices. Moreover, additional trusted third parties 
may be needed to ensure the secrecy of bidders. 

This paper proposes the first sealed-bid auction method which uses only a hash 
function. We use a hash chain to commit a bidding price. Though a hash chain 
method has been used for authentication, e.g. SKEY [Lam81], or payment, e.g. 
PAYWORD [RS96], the authors found a new application: sealing and 
comparing bids where the bidding prices except for the winning price are kept secret 
[KM99] [Suz99] [KMSH01]. The hash chain drastically reduces the time needed 
for bidding and opening bids. If we use a practical hash function, e.g. SHA-1 
[FIPS], our method is 200,000 times faster than former methods that use public 
key cryptosystems. Accordingly, our method is widely applicable in terms of the 
number of bidders and the range of bidding prices. 

In Section 2, we explain the sealed-bid auction and its requirements, and review 
two previous works. In Section 3, we introduce our sealed-bid auction method 
and discuss its security and efficiency. In Section 4, we conclude the paper. 

1.1 Related Works 

Kikuchi, Harkavy and Tygar showed the necessity of bid secrecy, and presented a 
sealed-bid auction method that uses a distributed decryption technique [HTK98] 
[KHT98]. Kudo used a time server to realize sealed-bid auctions [Kud98]. Naor, 
Pinkas and Sumner realized sealed-bid auctions by combining Yao’s secure 
computation with oblivious transfer [NPS99]. The method can compute any 
circuit, and so can realize various types of auctions (e.g. the Vickrey auction). Cachin 
proposed a sealed-bid auction using homomorphic encryption and an oblivious 
third party [Cac99]. The complexity of the method is polynomial in the 
logarithm of the number of possible prices. Stubblebine and Syverson proposed 
an open-bid auction method which uses a hash chain technique [SS99]. Sakurai 
and Miyazaki proposed a sealed-bid auction in which a bid is represented by 
the bidder’s undeniable signature of his bidding price [SM99]. Sako proposed a 
sealed-bid auction in which a bid is represented by a message encrypted with 
a public key that corresponds to the bidding price [Sak00]. 

D. Won (Ed.): ICISC 2000, LNCS 2015, pp. 183-191, 2001. 

Springer-Verlag Berlin Heidelberg 2001 

However, these sealed-bid auction methods use public key cryptosystems and 
so are computationally expensive or require a lot of communication. The 
inefficiency of the public key cryptosystems forces the auction methods to limit the 
number of bidders and the range of bidding prices. 

2 Sealed-bid Auction 

2.1 Sealed-bid Auction 

The sealed-bid auction is a type of auction in which bids are kept secret during 
the bidding phase. In the bidding phase, each bidder sends his sealed bidding 
price. In the opening phase, the auctioneer opens the sealed bids and determines 
the winning bidder and winning price according to a rule: the bidder who bids 
the highest (or lowest) price of all bidders is the winning bidder and his bidding 
price is the winning price.^1 Since there is essentially no difference between the 
highest price case and the lowest price case, we discuss only the former hereafter. 

Bidding : Let A be an auctioneer and $B_1, \ldots, B_m$ be m bidders. The 
auctioneer shows some goods for auction (e.g. a painting from a famous painter) and 
calls the bidders to bid their price for this item. Each bidder $B_i$ then decides 
his price, seals his price (e.g. in an envelope) and puts his sealed price into 
the auctioneer’s ballot box. 

Opening : After all bidders have put in their sealed prices, the auctioneer 
opens his ballot box. He reveals each sealed price and finds the bidder who bid 
the highest price. The bidder who bid the highest price wins and he can buy the 
item at his bid price. 

2.2 Requirements 

To achieve a fair auction, the sealed-bid auction protocol must satisfy the fol- 
lowing properties. 

Secrecy of bidding price : All bidding prices except the winning price must 
be kept secret even from the auctioneer. If the auctioneer can see some bidding 
prices before opening, he can tell the prices to a colluding bidder in order to 
cheat the auction. 

^1 Here we accept the case where several bidders submit identical highest bids. 

Verifiability : Anyone must be able to verify the correctness of the auction. 
As above, all bidding prices except the winning price are kept secret, so verifiability 
is necessary to convince all bidders. 

Undeniability : No bidder may be able to deny his bidding price. If a 
bidder can deny his bidding price, two bidders can collude to cheat the auction 
as follows. Two bidders $B_1$ and $B_2$ collude and bid prices $p_1$ and $p_2 = p_1 - 1$. 
If no other bidder bids at a price higher than or equal to $p_1$, bidder $B_1$ denies his bid 
at price $p_1$ and bidder $B_2$ wins the auction at his price $p_2$. If some other bidder bids 
at price $p_1$, bidder $B_1$ does not deny his bid at price $p_1$ and wins the auction at 
his price $p_1$. In this way, bidders $B_1$ and $B_2$ can bid at two selectable prices $p_1$ and 
$p_2 = p_1 - 1$ and so cheat the auction. 

Anonymity : The bidders must be able to bid anonymously. It is impor- 
tant to keep the privacy of bidders to prevent collusion between bidders and 
auctioneer. 

2.3 Previous Work 

To compare with our methods 1 and 2, we review here two former methods. 
These methods, as well as our methods, satisfy the above requirements. 



Former Method 1. Sakurai and Miyazaki proposed a sealed-bid auction in 
which a bid is represented by the bidder’s undeniable signature of his bidding 
price [SM99]. We describe their protocol as a good reference against which we 
compare our method 1. 

Bidding : Each bidder B chooses his bidding price p from the price list 
$P = \{1, 2, \ldots, n\}$. He creates an undeniable signature $Sig_B(p)$ of his bidding 
price, and publishes it as the commitment of his bidding price. 

Opening : After all bidders commit their bidding prices, the auctioneer 
iterates the following steps for $j = n, n-1, \ldots$ to open bids. 

If a bidder B proves that his bidding commitment $Sig_B(p)$ is a valid signature of 
price j by the confirmation protocol of the undeniable signature, the auctioneer 
is convinced that the winning bidder is bidder B and the winning price is j. 

If all bidders prove that their bidding commitments $Sig_B(p)$ are not valid 
signatures of price j by the repudiation protocol of the undeniable signature, the 
auctioneer is convinced that no bidder bid at price j. The auctioneer then decreases 
j by 1 and repeats the above steps. 

This protocol satisfies all the above required properties. However, all bidders 
have to communicate with the auctioneer in the opening phase. 



Former Method 2. Sako proposed a sealed-bid auction in which a bid is 
represented by a message encrypted with the key which corresponds to the bidding 
price [Sak00]. We describe her protocol as a good reference against which we 
compare our method 2. 

Let E and D be probabilistic encryption and decryption functions. The 
auctioneer publishes keys $K_j$ and messages $M_j$ corresponding to each price 
$j \in \{1, 2, \ldots, n\}$. 

Bidding : Each bidder B chooses his bidding price p from the price list 
$P = \{1, 2, \ldots, n\}$. He creates his encrypted message $C_B = E_{K_p}(M_p)$ with the 
key $K_p$ which corresponds to his bidding price p, and publishes it as his bid. 

Opening : After all bidders send their bids, the auctioneer iterates the following 
steps for $j = n, n-1, \ldots$ to open bids. 

If the relation $D_{K_j}(C_B) = M_j$ holds, the auctioneer is convinced that the winning 
bidder is bidder B and the winning price is j. 

If the relation $D_{K_j}(C_B) = M_j$ does not hold for any bidder, the auctioneer is 
convinced that no bidder bid at price j. The auctioneer then decreases j by 1 and 
repeats the above steps. 

This protocol satisfies all the above required properties. Moreover, bidders 
need not communicate with the auctioneer in the opening phase. However, a malicious 
auctioneer can reveal all bidding prices. To prevent this, we have to use multiple 
auctioneers and a distributed decryption technique. 

3 Sealed-bid Auction using Hash Chain 

Our sealed-bid auction method is the first method which uses only a hash function. 
It is based on a chain of a hash function. By using a hash chain we can drastically 
reduce the time taken for bidding and opening bids. If we use a practical hash 
function, e.g. SHA-1, our method is 200,000 times faster than former methods 
that use public key cryptosystems. The method resembles SKEY in that 
the proof of the commitment plays the role of the next commitment, and 
PAYWORD in that the number of iterations of the hash function 
represents the value. 

3.1 Proposed Sealed-bid Auction Method 1 

The first proposed sealed-bid auction method is as follows. 

Preparation : Let A be an auctioneer and $B_1, B_2, \ldots, B_m$ be bidders. 
Auctioneer A publishes 

— $P = \{1, 2, \ldots, n\}$ : a price list of the auction, 

— $h : \{0,1\}^{s+t} \to \{0,1\}^t$ : a collision intractable^2 random hash function, 

— $M_{no}, M_{yes} \in \{0,1\}^s$ : messages for “I do NOT bid” and “I DO bid”. 

Bidding : In the bidding phase, each bidder $B_i$ decides his bidding price 
$p_i \in P$. He chooses his secret seed $S_i \in \{0,1\}^t$ randomly and computes a hash 
chain 

$$L_{p_i-1,i} = S_i, \qquad L_{p_i,i} = h(M_{yes} \| L_{p_i-1,i}),$$

$$L_{j,i} = h(M_{no} \| L_{j-1,i}) \quad (j = p_i+1, p_i+2, \ldots, n).$$

He then sends the commitment of his bidding price with his signature 

$$(L_{n,i}, Sig_{B_i}(L_{n,i}))$$

to auctioneer A. Auctioneer A receives all bids $(L_{n,i}, Sig_{B_i}(L_{n,i}))$ $(i = 1, 2, \ldots, m)$, 
verifies the signatures and publishes all bids. 

^2 In this paper “collision intractable” means that finding $x \neq y$ such that h(x) = h(y) is 
computationally infeasible. We consider a practical hash function, e.g. SHA-1. 

Opening : In the opening phase, auctioneer A iterates the following steps for 
prices $j = n, n-1, \ldots$. 

Auctioneer A receives messages $L_{j-1,i}$ $(i = 1, 2, \ldots, m)$ from all bidders $B_i$ 
and checks the following equality for $i = 1, 2, \ldots, m$: 

$$L_{j,i} = h(M_{no} \| L_{j-1,i}).$$

If the above equation holds for all i’s, auctioneer A concludes that no bidder 
bids price j. Auctioneer A then publishes all messages $L_{j-1,i}$ $(i = 1, 2, \ldots, m)$, 
decreases price j by 1, and receives messages $L_{j-2,i}$ $(i = 1, 2, \ldots, m)$, and so on. 

If the above equation does not hold for some i’s, auctioneer A checks the 
following equality for these i’s 

$$L_{j,i} = h(M_{yes} \| L_{j-1,i})$$

and concludes that the winning bidders are these $B_i$’s and their winning price 
is j. Finally, auctioneer A publishes the winning bidders, the winning price and 
all messages $L_{j-1,i}$ $(i = 1, 2, \ldots, m)$. 
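
The bidding and opening phases of method 1 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the one-byte encodings of $M_{yes}$ and $M_{no}$, the helper names `make_chain` and `open_bids`, and the 20-byte seeds are assumptions, and the signatures on the commitments are omitted entirely.

```python
import hashlib
import secrets

# Assumed one-byte encodings of M_yes and M_no (hypothetical choices).
M_YES = b"Y"
M_NO = b"N"


def h(data: bytes) -> bytes:
    """The hash function h, instantiated here with SHA-1."""
    return hashlib.sha1(data).digest()


def make_chain(price: int, n: int, seed: bytes) -> dict:
    """Build L_{price-1}, ..., L_n for one bidder; chain[n] is the commitment."""
    chain = {price - 1: seed}
    chain[price] = h(M_YES + seed)
    for j in range(price + 1, n + 1):
        chain[j] = h(M_NO + chain[j - 1])
    return chain


def open_bids(chains: list, n: int):
    """Simulate the opening phase; return (winning_price, winner_indices)."""
    current = [chain[n] for chain in chains]           # published L_{n,i}
    for j in range(n, 0, -1):
        revealed = [chain[j - 1] for chain in chains]  # bidders send L_{j-1,i}
        winners = []
        for i, (l_j, l_prev) in enumerate(zip(current, revealed)):
            if l_j == h(M_NO + l_prev):
                continue                               # bidder i did not bid j
            if l_j != h(M_YES + l_prev):
                raise ValueError(f"bidder {i} sent an invalid chain value")
            winners.append(i)
        if winners:
            return j, winners
        current = revealed                             # move down one price
    return None


n = 10
chains = [make_chain(p, n, secrets.token_bytes(20)) for p in (3, 7, 5)]
print(open_bids(chains, n))  # (7, [1])
```

Note that the loop stops at the highest bid, so chain values below the winning price are never requested from the losing bidders.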

Notice that we can omit the messages $M_{no}$, i.e. each bidder bids $L_{n,i} = 
h^{n-p_i+1}(M_{yes} \| S_i)$ with his signature. 

We can also realize a sealed-bid auction with quantity as follows. Each bidder 
$B_i$ decides his bidding quantity for each price j and sets the messages 

$M_{j,i}$ = “bidding quantity of bidder $B_i$ at price j”. 

He then makes his hash chain $L_{j,i} = h(M_{j,i} \| L_{j-1,i})$, and bids his commitment 
with his signature $(L_{n,i}, Sig_{B_i}(L_{n,i}))$. 

3.2 Proposed Sealed-bid Auction Method 2 

In our first method, each bidder has to communicate with the auctioneer to open 
his bid. In our second method, since each bidder sends the secret seeds of his hash 
chains to the auctioneers, he does not need to communicate to open his bid. To keep 
the secrecy of the bidding price, we use multiple auctioneers who cooperate to open 
bids. The secrecy of the bidding price is then kept unless all auctioneers 
collude. The second proposed sealed-bid auction method is as follows. 

Preparation : Let $A_1, A_2, \ldots, A_a$ be auctioneers and $B_1, B_2, \ldots, B_m$ be 
bidders. Auctioneer $A_1$ publishes 

— $P = \{1, 2, \ldots, n\}$ : a price list of the auction, 

— $h : \{0,1\}^t \to \{0,1\}^t$ : a collision intractable random hash function. 

Bidding : In the bidding phase, each bidder $B_i$ decides his bidding price 
$p_i \in P$. He chooses his secret seeds $S_{1,i}, \ldots, S_{a,i} \in \{0,1\}^t$ randomly and 
computes his bid 

$$Bid_i = \{b_i, C_{1,i}, C_{2,i}, \ldots, C_{a,i}\}$$

where 

$$b_i = h(h^{p_i}(S_{1,i}) \| h^{p_i}(S_{2,i}) \| \cdots \| h^{p_i}(S_{a,i})), \qquad C_{j,i} = h^{n+1}(S_{j,i}).$$

He then sends the commitment of his bidding price with his signature 

$$(Bid_i, Sig_{B_i}(Bid_i))$$

to auctioneer $A_1$. Auctioneer $A_1$ receives all bids $(Bid_i, Sig_{B_i}(Bid_i))$ $(i = 1, 2, \ldots, m)$, 
verifies the signatures and publishes all bids. After all bids are published, each 
bidder $B_i$ sends his secret seed $S_{j,i}$ to each auctioneer $A_j$ through a secure 
channel (e.g. using encryption with auctioneer $A_j$’s key). 

Opening : In the opening phase, auctioneers $A_1, A_2, \ldots, A_a$ cooperate to 
open the bids. First, each auctioneer $A_j$ checks the relation 

$$C_{j,i} = h^{n+1}(S_{j,i})$$

to verify the validity of the secret seed $S_{j,i}$. Then auctioneers $A_1, A_2, \ldots, A_a$ iterate 
the following steps for prices $k = n, n-1, \ldots$. 

Each auctioneer $A_j$ publishes the values $h^k(S_{j,i})$ $(i = 1, 2, \ldots, m)$. Auctioneer 
$A_1$ then checks the following equality for $i = 1, 2, \ldots, m$: 

$$b_i = h(h^k(S_{1,i}) \| h^k(S_{2,i}) \| \cdots \| h^k(S_{a,i})).$$

If the above equation holds for no i, auctioneer $A_1$ concludes that 
no bidder bids price k. Auctioneer $A_1$ then publishes all values $h^k(S_{j,i})$ 
$(j = 1, 2, \ldots, a,\ i = 1, 2, \ldots, m)$, decreases price k by 1, and each auctioneer $A_j$ 
publishes the values $h^{k-1}(S_{j,i})$ $(i = 1, 2, \ldots, m)$, and so on. 

If the above equation holds for some i’s, auctioneer $A_1$ concludes that the 
winning bidders are these $B_i$’s and their winning price is k. Finally, auctioneer 
$A_1$ publishes the winning bidders, the winning price and all values $h^k(S_{j,i})$ 
$(j = 1, 2, \ldots, a,\ i = 1, 2, \ldots, m)$. 
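
As with method 1, the second method can be illustrated by a short simulation. The sketch below assumes SHA-1 for h, 20-byte seeds, and plain byte concatenation for the joining of the a chain values; the names `make_bid` and `open_bids` are hypothetical, all auctioneers run in one process, and signatures and the secure channels are left out.

```python
import hashlib
import secrets


def h(data: bytes) -> bytes:
    """The hash function h, instantiated here with SHA-1."""
    return hashlib.sha1(data).digest()


def h_iter(value: bytes, k: int) -> bytes:
    """Apply h to value k times, i.e. compute h^k(value)."""
    for _ in range(k):
        value = h(value)
    return value


def make_bid(price: int, n: int, a: int):
    """Return (seeds, (b, commits)) for one bidder and a auctioneers."""
    seeds = [secrets.token_bytes(20) for _ in range(a)]
    b = h(b"".join(h_iter(s, price) for s in seeds))    # b_i
    commits = [h_iter(s, n + 1) for s in seeds]         # C_{j,i}
    return seeds, (b, commits)


def open_bids(bids: list, seeds_by_bidder: list, n: int):
    """Simulate the cooperative opening; return (price, winner_indices)."""
    # Each auctioneer A_j first checks C_{j,i} = h^{n+1}(S_{j,i}).
    for i, ((_, commits), seeds) in enumerate(zip(bids, seeds_by_bidder)):
        if any(h_iter(s, n + 1) != c for s, c in zip(seeds, commits)):
            raise ValueError(f"bidder {i} sent an invalid seed")
    for k in range(n, 0, -1):
        winners = []
        for i, ((b, _), seeds) in enumerate(zip(bids, seeds_by_bidder)):
            published = [h_iter(s, k) for s in seeds]   # h^k(S_{j,i}) from A_j
            if b == h(b"".join(published)):
                winners.append(i)
        if winners:
            return k, winners
    return None


n, a = 8, 3
made = [make_bid(p, n, a) for p in (2, 6, 6, 1)]
seeds_by_bidder = [seeds for seeds, _ in made]
bids = [bid for _, bid in made]
print(open_bids(bids, seeds_by_bidder, n))  # (6, [1, 2])
```

In a real deployment each auctioneer would hold only its own seeds and publish only the values $h^k(S_{j,i})$; recomputing them from the seeds in one place, as done here, is purely for brevity.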

3.3 Security 

The proposed method satisfies the properties listed in Section 2. 

Secrecy of bidding price : Since bids are opened from the highest price to 
the price at which the winning bidders bid, no bids under the winning price are 
opened. Due to the one-wayness of the hash function, no one can compute the 
rest of the hash chain that has not yet been revealed. Due to the randomness of 
the hash function, no information about the rest of the hash chain that has not 
yet been revealed can be obtained. Thus all bidding prices except the winning price 
are kept secret even from the auctioneer. In the case of method 2, since the hash 
chain is distributed to multiple auctioneers, all bidding prices except the winning price 
are kept secret unless all auctioneers collude. 

Verifiability : Since the values of the hash chain which are opened are published, 
anyone can verify the correctness of the hash chains. Due to the collision 
intractability of the hash function, no one can change the hash chain without 
changing the committed head of the hash chain. Thus anyone can verify the correctness 
of the auction. 

Undeniability : Since each bid has the bidder’s signature, no bidder can 
deny his bid. Due to the collision intractability of the hash function, no one can 
change the hash chain without changing the committed head of the hash chain. So no 
bidder can deny his bidding price. 

Anonymity : We can introduce a registration center to register bidders and 
issue a pseudonym (= public key of signature) for each bidder. Each bidder can 
participate in an auction using his pseudonym and so bid anonymously. 

By these properties, our method achieves a fair sealed-bid auction. 



3.4 Efficiency 

We compare our method 1 to the former method [SM99] and our method 2 to the former 
method [Sak00] in terms of the volume of data communications and computational 
complexity. We consider only the opening phase, since the load of 
this phase is the highest of all phases. Table 1 shows the comparison result in 
the worst case. Here, m bidders bid and there are n bidding prices. In the case 
of our method 2 and the former method [Sak00], the number of auctioneers is a. 



Table 1. The load of the opening phase. 

Method         Volume of data communication (bit)   Computational complexity (μs) 
Our method 1   L_hash × mn = 160mn                  T_hash × mn = 3.6mn 
[SM99]         L_exp × 10mn = 10240mn               T_exp × 10mn = 800000mn 
Our method 2   0                                    T_hash × (amn + mn) = 3.6(amn + mn) 
[Sak00]        0                                    T_exp × amn = 80000amn 



Here, L_hash and L_exp denote the size of the output data of the hash function 
and the size of the modular exponentiation, respectively. Similarly, T_hash and T_exp 
denote the time taken for performing a single hash function evaluation and 
a single modular exponentiation, respectively. For the hash function SHA-1, 
L_hash is 160 bits and T_hash is 3.6 μs on a Pentium II 233MHz (translated from 
the original data 55.1 Mbit/s on a Pentium 90MHz [Bos]). For 1024-bit modular 
exponentiation, L_exp is 1024 bits and T_exp is 80000 μs on a Pentium II 233MHz 
[Sha]. 

Comparing the computational complexity of our method 1 with that of [SM99] in 
Table 1 (800000mn / 3.6mn ≈ 2.2 × 10^5), our method is about 200,000 times faster 
than former methods. Thus, our method 
supports a large number of bidders and a wide range of bidding prices. 
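
The worst-case figures of Table 1 are easy to evaluate for concrete auction sizes. The following sketch simply plugs the timing constants quoted above into the table's formulas; the function name `opening_cost_us` and the sample values of m, n and a are assumptions made for illustration.

```python
T_HASH_US = 3.6      # one SHA-1 evaluation, in microseconds (from the text)
T_EXP_US = 80_000    # one 1024-bit modular exponentiation, in microseconds


def opening_cost_us(m: int, n: int, a: int = 1) -> dict:
    """Worst-case opening-phase computation time per Table 1, in microseconds."""
    return {
        "our method 1": T_HASH_US * m * n,
        "[SM99]": T_EXP_US * 10 * m * n,
        "our method 2": T_HASH_US * (a * m * n + m * n),
        "[Sak00]": T_EXP_US * a * m * n,
    }


# Hypothetical auction: 100 bidders, 1000 prices, 2 auctioneers.
costs = opening_cost_us(m=100, n=1000, a=2)
speedup_1 = costs["[SM99]"] / costs["our method 1"]
speedup_2 = costs["[Sak00]"] / costs["our method 2"]
print(f"method 1 vs [SM99]:  {speedup_1:,.0f}x")
print(f"method 2 vs [Sak00]: {speedup_2:,.0f}x")
```

Under these formulas the ratio for method 1 is independent of m and n, while the ratio for method 2 depends on the number of auctioneers a through the factor a / (a + 1).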

4 Conclusion 

We have proposed the first sealed-bid auction method which uses only a hash 
function. In our method, the length of the hash chain represents the bidding 
price, and the bidder uses the head of his hash chain to commit his bid. We 
showed that our method satisfies all the requirements of sealed-bid auctions. 
The method drastically reduces the time taken for bidding and opening bids. 
If we use a practical hash function, e.g. SHA-1, our method is 200,000 times 
faster than former methods that use public key cryptosystems. Accordingly, our 
method is widely applicable in terms of the number of bidders and the range of 
bidding prices. 

Abstract. Electronic payment systems for wireless devices need to take into 
account the limited computational and storage ability of such devices. 
Micropayment schemes seem well suited to this scenario since they are specifically 
designed for efficient operation. Most micropayment schemes require a digital 
signature and therefore users must support public key operations and, 
furthermore, a public key infrastructure must be available. Such schemes are not 
suitable for current wireless systems since public key technology is not supported. 
We examine the SVP micropayment scheme which overcomes this problem by 
using only symmetric key cryptography and relying on tamper resistance. Some 
limitations are observed in the SVP micropayment scheme and an enhanced 
scheme is proposed suitable for current generation wireless communications. 



1 Introduction 

Over recent years, there has been a significant increase in both the scale and the di- 
versity of electronic transactions over the Internet. Electronic commerce (E- 
commerce) means electronic payment (E-payment) in a narrow sense but it may mean 
electronic business in a broader sense which also includes the exchange of informa- 
tion not directly related to the actual purchasing activities [GrFe99]. In this paper, the 
term E-payment will be used to describe purchasing activity itself between buyers and 
sellers. Secure electronic payments will not only make purchasing activities more 
flexible and convenient but also create as yet unimagined new markets [Wayn97]. 

As the integration of computing and communications continues, wireless 
computing devices are of increasing importance as devices to access the Internet and to make 
electronic transactions. Many E-payment schemes have been proposed, and a lot of 
them assume the use of today’s well-established credit card business environment. 
The most widely agreed and dominant E-payment protocol is the SET (Secure 
Electronic Transaction) protocol, produced by Visa and MasterCard to be their standard 
for processing credit card transactions over networks like the Internet. This, and other 
similar schemes, use extensive cryptographic technologies, a lot of which are based 
on public-key cryptography, to satisfy high level security requirements such as 
nonrepudiation. These protocols are all appropriate for medium to large transactions 
(macropayments) of more than $5 or $10. 

D. Won (Ed.): ICISC 2000, LNCS 2015, pp. 192-205, 2001. 

These macropayment protocols will be too expensive and time-consuming when 
applied to inexpensive transactions, 50 or 25 cents and less, because of the transaction 
charges of card companies and the computational cost of public-key 
signature/verification. They also place a heavy burden on the computational and storage 
capabilities of currently available wireless devices. Without appropriate cheap 
alternative schemes, the light-weight transaction market cannot be developed to its full 
potential. This market typically includes selling inexpensive information, software, and 
services (e.g. directory search or games), usually delivered online, and is a prime 
market for wireless communications. 

Several schemes have been proposed for micropayment. To reduce the computa- 
tional and signalling burden down to a reasonable level which can be justified in mi- 
cropayment environments, they try to avoid public-key cryptography partially or en- 
tirely. Their dependency on on-line access to banking/clearing systems is also small 
compared to macropayment schemes. In this paper, we focus on micropayment 
schemes because this category not only directly addresses the limited resources of 
mobile communications but also is the most reasonable option for applying to the 
light-weight E-payment by mobile users. 

Figure 1 shows a general scenario in E-payment environments, adopted from 
[Ahuj96]. 



[Figure: the consumer’s browser, the shopping mall Web site and the merchant 
system Web site, connected by numbered steps: 1. Select a store; 2. Link to 
merchant server; 3. Present home page; 4. Select goods, make payments; 
6. Confirm payment; 7. Update consumer with account status] 

Fig. 1: E-payment Environments 

In this scenario, there are four elements or role players as follows. 

• A consumer along with a Web browser uses the hyperlinks from the mall to ac- 
cess the merchant’s home page. 

• A merchant system residing on an online Web server with a connection to Web 
browsers over the Internet consists of the home page and related software to 
manage business. 

• An online shopping mall may help direct consumers to the merchant server. It 
may pay for the merchant to enlist with one or more well-known shopping malls. 

• A background banking network supports electronic payments from consumers 
to the merchant. This network may include two types of banks: 

- a merchant’s bank maintains the account for the merchant, and authorizes and 
processes the payments. It may use an on-line real-time link to the merchant so as 
to allow online authorization of consumer payments, and a link with the 
consumer’s bank for verifying the transactions. 

- a consumer’s bank manages the account for the consumer, and has an offline 
link to the consumer, such as via postal mail or e-mail. 

These four role players take part in the following sequence of E-payment related ac- 
tivities. 

1. The consumer accesses the shopping mall and selects a shop for purchasing cer- 
tain items. 

2. The shopping mall server accesses the merchant system for the selected shop. 

3. The merchant system presents the store’s home page to the consumer. It also 
includes information on the various goods available from this store. 

4. The consumer selects the desired goods, interacts with the merchant’s system, 
and makes the payments. 

5. The merchant system accesses its bank for authorization of the consumer pay- 
ment. 

6. The merchant system informs the consumer that the payment is accepted and the 
transaction is completed. (At a later time, the merchant’s bank obtains payment 
from the consumer’s bank.) 

7. The consumer’s bank informs the consumer of the money transfer through mail 
such as a monthly report or online bank account. 



2 E-Payment Mechanisms in Mobile Environments 

The European ASPeCT project [ASPe96] has investigated security services for next 
generation wireless communications. The project included proposals for secure billing 
using hash based micropayments in combination with a digital signature scheme 
[ASPe97]. The digital signature serves a dual role as it is also used for authentication 
purposes during the establishment of a session key to secure the session data 
[HoPr98]. 

Provision of secure and trustworthy E-payment mechanisms will be the most criti- 
cal factor for the success of E-commerce. Such a payment scheme must satisfy the 
following requirements [Ahuj96]. 

• Strong authentication of each party using certificate and digital signature 

• Privacy of transaction using encryption 

• Transaction integrity using message digest algorithms 

• Nonrepudiation to handle disputes about the transaction 

There are many classification methods for E-payment schemes, many of them rather 
orthogonal to each other. The following list shows some of the many classification 
criteria, most of which are described in detail in the report of the ASPeCT protocol 
[ASP97]: 

• Electronic purse/cash/credit 

• On-line/off-line 

• Credit-based/debit-based 

• Software-based/tamper-resistant hardware 

• Macropayment/micropayment 

The public-key based security protocols for mobile users are not likely to be deployed 
in the early stage of future mobile communications such as IMT2000 [Buha97, 
ITU97]. They will be introduced into the systems when a fully-fledged public key 
infrastructure is available, which may yet take a number of years. Therefore the 
assumption of limited use of tamper resistant devices by mobile users for E-payment in 
the near future is very likely and reasonable. Payment takes place using hashing 
mechanisms which are very cheap in computation while the broker can remain 
off-line. In mobile communication environments, there is already a well established 
infrastructure for billing users. This means that we do not need to establish extra 
clearing/banking infrastructure for mobile E-payment. A suggested model for billing users 
for micropayments is presented in Figure 2. 




[Figure omitted; legend: on-line link, off-line link] 

Fig. 2: Billing of Micropayments 

The role players in the mobile micropayment shown in Figure 2 comprise mobile 
users, service providers (SPs) and value added service providers (VASPs). Here, an SP 
plays the role of the broker in general micropayment environments. It bills the user 
for both basic and value-added services, and then redeems the relevant payment to the 
VASP. Considering the light-weight nature of most transactions to be carried out 
through mobile communications, the VASP-SP interface will usually be off-line. 

3 Micropayments Using Hash Chains 

The most widely studied and promising approach involves using a public-key 
signature together with a hash-chain. Four similar schemes have been proposed: PayWord 
[RiSh96], iKP’s micropayment [HSW96], Netcard [AMS95], and Pedersen’s scheme 
[Pede95]. The basic idea is that a signature value generated using a public key 
operation is spread over many other cryptographic values derived by much more efficient 
one-way functions such as hash functions. In other words, the effect of a digital 
signature is reused many times over subsequent messages (containing preimages of a 
specific hash). This mechanism was originally proposed for use in an authentication 
scheme by Lamport [Lamp81]. The following description of the hash chain scheme is 
based on the PayWord proposal of Rivest and Shamir [RiSh96]. 



3.1 Issuing User Certificate 

• The user U establishes an account with a broker B. U supplies personal details to 
B, such as a credit card number and delivery address, together with U’s public key, 
$K_U$. U’s aggregate charges will be charged to her credit-card number. 

• The broker issues to U a PayWord Certificate, which is a signed statement by B 
containing: 

- broker’s name B 

- user name U 

- user’s IP-address 

- user’s public key 

- expiration date ExpDate 

- other information, possibly including user-specific information such as: 

• a certificate serial number, 

• credit limits to be applied per vendor, 

• information on how to contact the broker, 

• broker/vendor terms and conditions. 

The user’s certificate has to be renewed by the broker regularly (e.g. monthly); the 
broker will do so only if the user’s account is in good standing. 



3.2 Typical Scenario 

The user’s certificate authorizes the user to make payword chains, and assures 
vendors that the user’s paywords are redeemable by the broker. When the user U wishes 
to make a micropayment she clicks on a link to a vendor V’s charged web page. 

• The user’s browser determines whether this is the first request to V that day. 

- For a first request, U computes and signs a commitment to a new user-specific 
and vendor-specific chain of paywords $c_1, c_2, \ldots, c_N$. 

• The user creates the payword chain in reverse order by picking the last 
payword $c_N$ at random, and then computing 
$c_i = h(c_{i+1})$ for $i = N-1, N-2, \ldots, 0$. 

Here $c_0$ is the root of the payword chain, and is not a payword itself. The 
commitment contains the root $c_0$, but not any payword $c_i$ for $i > 0$. 

• commitment M = {V, UCert, $c_0$, Date, OtherInfo} 

• commitment includes both identities of the user and the vendor, and so is 
both user- specific and vendor- specific. 

- The user provides this commitment and her certificate to the vendor V, who 
verifies the signatures. 

• The i-th payment (for i = 1, 2, ...) from U to V consists of the pair $(c_i, i)$, which V 
can verify using $c_{i-1}$. 

• At the end of each day, V reports to the broker B the last (highest-indexed) 
payment $(c_l, l)$ received from each user that day, together with each corresponding 
commitment. 

• The broker charges subscription and/or transaction fees. 

Figure 3 shows the generation of hash-chain and commitment in the above scheme. 
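
As an illustration only, the chain generation and the vendor's per-payment check described above can be sketched as follows. SHA-256 as the one-way function h and the helper names `make_chain` and `verify_payment` are assumptions of this sketch; they are not taken from the PayWord proposal, and the signature on the commitment is omitted.

```python
import hashlib
import os
from typing import List, Optional


def h(data: bytes) -> bytes:
    """One-way function for the chain (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(data).digest()


def make_chain(n: int, seed: Optional[bytes] = None) -> List[bytes]:
    """Return [c_0, c_1, ..., c_n] with c_i = h(c_{i+1}); c_n is picked at random."""
    chain = [b""] * (n + 1)
    chain[n] = seed if seed is not None else os.urandom(32)
    for i in range(n - 1, -1, -1):
        chain[i] = h(chain[i + 1])
    return chain


def verify_payment(c_prev: bytes, c_i: bytes) -> bool:
    """Vendor-side check of the i-th payword against the previously accepted value."""
    return h(c_i) == c_prev


chain = make_chain(5, seed=b"demo seed")
root = chain[0]              # c_0 goes into the signed commitment
accepted = root
for i in range(1, 4):        # the user spends c_1, c_2, c_3 in order
    assert verify_payment(accepted, chain[i])
    accepted = chain[i]
```

The vendor only needs to remember the last accepted value, which is why each payment costs a single hash evaluation.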






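[Figure 3: the paywords ..., $c_{N-1}$, $c_{N-2}$, $c_{N-3}$, ... are obtained by repeated hashing; the 
result of the chain, the identities of user and vendor, and a timestamp and/or challenge 
from the vendor are digitally signed with the user’s private key, and the resulting 
commitment is sent to the vendor.] 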



Fig. 3: Hash Chain Protocol 



3.3 Problems with Existing Hash Chains Schemes 

There are a number of significant overheads which are implied by this model. We 
would like to emphasise that these are not currently present for most mobile users. 

• The user must have the capability of public key operations in her mobile 
communications device. 

• The user must have the ability to generate, or acquire, a suitable public key. 

• There must be some mechanism for users to be able to revoke certificates when 
they are compromised, and for merchants to know that certificates have been re- 
voked. 

Wireless communications systems in the current generation use only symmetric key 
based cryptography [Mehr97, Moha96]. This is the main problem with the wireless 
application of existing micropayment schemes based on hash-chains. In Section 5 we 
will examine how to include hash chains into a symmetric cryptography setting so as 
to be able to benefit from the idea in the current wireless context. 



4 SVP Scheme Based on Tamper-Resistant Device 

An alternative to use of a digital signature for micropayments is to employ a tamper- 
resistant device together with symmetric key cryptography. One such scheme called 
Small Value Payment (SVP) was proposed by Stern and Vaudenay [StVa97]. It aims 
to provide an even cheaper and more effective micropayment scheme than the ap- 
proach using hash chains, by avoiding the use of asymmetric key cryptography. In- 
stead, it requires the use of tamper-resistant devices both at the consumer and the 
merchant sides. We note that current mobile communications systems include use of 
a smart card which provides a degree of tamper resistance. The SVP scheme is illus- 
trated in Figure 4. 




Fig. 4: Small Value Payment Protocol 



Initialization. The broker B generates its own secret key $k_s$ and communicates it in a 
secure way to the device of each merchant, where $k_s$ is a common value to all 
merchants. It also generates a random value t and computes a spending key 
$s = \mathrm{Mac}_{k_s}(C, t)$ for a consumer C, where s is unique to the consumer. In this 
equation Mac is a message authentication code which should have the property that it 
cannot be formed without knowledge of the key and it is infeasible to find a 
different input (C, t) which gives the same output s. Both s and t are given by B to C as 
authorisation for C to spend an agreed amount of money. 



4.1 Payment Protocol 

The payment protocol can be described as follows. In this description h is a hash 
function which can be regarded as a MAC operation with the spending key s as the 
MAC key. 

C: consumer, M: merchant 

1. C → M: C, t, r (r: random number chosen by C) 

2. C ← M: q (q: random challenge) 

3. C → M: v = h(C, M, c, q, r, s) (c: microamount) 

M: checks if v = h(C, M, c, q, r, $\mathrm{Mac}_{k_s}(C, t)$), 

keeps an account balance for the user and increases C’s account by c, and 
(optionally) stores (t, q, r, v) if he is suspicious about this payment. 



From the response in message 3, M is able to determine whether C is in possession of 
the spending key s. If so M knows that C was authorised to make payments associated 
with the value t. Note that there is no mechanism to prevent C using s any number of 
times. In this sense SVP is a credit based scheme in which C is trusted to pay the bills 
accrued from use of s. 
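
The three messages above can be sketched as follows, under stated assumptions: HMAC-SHA-256 stands in for both Mac and the keyed hash h, fields are length-prefixed before hashing, and all function names are hypothetical. The sketch ignores the tamper-resistant devices on which the scheme relies.

```python
import hashlib
import hmac
import os


def mac(key: bytes, *fields: bytes) -> bytes:
    """HMAC-SHA-256 over length-prefixed fields (an assumed instantiation of Mac and h)."""
    m = hmac.new(key, digestmod=hashlib.sha256)
    for field in fields:
        m.update(len(field).to_bytes(4, "big") + field)
    return m.digest()


def spending_key(k_s: bytes, consumer: bytes, t: bytes) -> bytes:
    """s = Mac_{k_s}(C, t), computed by the broker and by each merchant device."""
    return mac(k_s, consumer, t)


def response(consumer, merchant, amount, q, r, s) -> bytes:
    """Message 3: v = h(C, M, c, q, r, s), with s used as the MAC key."""
    return mac(s, consumer, merchant, str(amount).encode(), q, r)


def merchant_check(k_s, consumer, merchant, amount, t, q, r, v) -> bool:
    """The merchant recomputes s from (C, t) and compares the expected v."""
    s = spending_key(k_s, consumer, t)
    expected = response(consumer, merchant, amount, q, r, s)
    return hmac.compare_digest(v, expected)


k_s = os.urandom(32)                  # broker key shared with all merchants
C, M = b"consumer-1", b"merchant-1"
t = os.urandom(16)                    # random value issued by the broker
s = spending_key(k_s, C, t)           # handed to the consumer together with t
r = os.urandom(16)                    # message 1: C, t, r
q = os.urandom(16)                    # message 2: challenge from M
v = response(C, M, 5, q, r, s)        # message 3
```

Note that nothing in the check limits how often s is used, which matches the credit-based nature of SVP pointed out above.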



4.2 Payment Clearing 

The merchant regularly sends the broker a statement of the amount of money spent by 
consumers, and the broker monitors to check if the accounts are consistent. If not, the 
broker requests a valid proof (C, c, t, q, r, v) of payment from M. If it cannot be pro- 
vided, the broker just refuses the payment and records that there must be a problem 
with C or M. If such a proof is released, the broker pays and checks if (M, q, r) has 
already been credited to M. If it has, the broker suspects the merchant to be dishonest. If 
not, the broker stores (M, q, r) in the (C, t)-records. 
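
The broker's duplicate check can be pictured with the following minimal sketch. The class name `BrokerRecords` and its in-memory storage are hypothetical, and the verification of the proof value v itself is left out.

```python
class BrokerRecords:
    """Per-(C, t) sets of (M, q, r) triples that have already been credited."""

    def __init__(self):
        self._seen = {}

    def credit(self, consumer, t, merchant, q, r) -> bool:
        """Return True for a fresh proof, False if (M, q, r) was credited before."""
        seen = self._seen.setdefault((consumer, t), set())
        triple = (merchant, q, r)
        if triple in seen:
            return False        # the merchant is suspected of replaying a payment
        seen.add(triple)
        return True


records = BrokerRecords()
first = records.credit("C", "t1", "M", "q1", "r1")
second = records.credit("C", "t1", "M", "q1", "r1")
```

If such records are ever reset, a replayed triple passes as fresh again, which is the weakness discussed in Section 4.3.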



4.3 Problems with the SVP Scheme 

We have identified a number of problems with the SVP scheme which affect both the 
security and efficiency of a practical implementation in the wireless environment. 

• There is no signature from the user, and thereby the scheme does not provide 
non-repudiation (the merchant and/or the broker can generate all the security 
parameters). This is why the scheme depends heavily on the use of tamper resistant 
devices. The compromise of only one tamper resistant device on the merchant 
side enables an attacker to impersonate other consumers. 

• The shared secret key $k_s$ between the broker and all the merchants must be the 
same, because the user’s spending key is a function of the key $k_s$. Every merchant 
(more precisely, its tamper-proof device) in transaction with the user must be able 
to compute the spending key. Stern and Vaudenay recognise this problem in their 
paper and suggest a solution in which the broker has several secret keys, a 
subset of which is shared with each merchant. Users must then obtain multiple 
spending keys and provide a corresponding subset of spending keys to the 
merchant. This solution adds storage and complexity to the scheme, while 
compromise of a set of merchants’ keys can still allow spending at different merchants. 

• The user and the VASP must execute the three-way challenge-response protocol 
for every micropayment, which is inefficient compared with the hash chain ap- 
proach exchanging only one message (preimage of a hash chain). 

• There is no authentication of the merchant to the user. The cost of three 
moves for each microamount, while mutual authentication is still lacking, may not be 
justified even for a micropayment environment. 

• Weakness in the message freshness: the mechanism of replay-detection against 
the merchant is vulnerable to the following attack scenario: 

- Broker resets all the account records periodically, e.g., every month. 

- Merchant reuses the old parameters (used in the previous months). 

- Broker checks if (M, q, r) has been used before, but the check cannot be 
applied to transactions outside the manageable period. 

The last problem with regard to replay attack can be easily fixed by including date 
information (yymmdd) in the commitment computation procedure. This prevents a 
merchant from cheating the broker with old payment data received from the user 
previously. We have included such a mechanism in the new scheme described in 
Section 5. 



5 A New Scheme Using Tamper-Resistant Devices 

Exploiting the advantages of both the hash chain and the tamper-resistance schemes, 
we have designed a new scheme. We can avoid both the expensive asymmetric cryp- 
tography even for the payment initialization, and challenge-response for each pay- 
ment of microamount. 

Figure 5 shows the required setting of this scheme assuming the use of 
tamper-resistant devices. The role players in this scheme are taken by those in mobile 
environments: the user, the VASP, and the user’s SP. There are three distinct kinds of 
shared secret keys: 

• between the user and the SP; 

• between the VASP and the SP; 

• between the user and the VASP. 

In fact, the shared key $K_{UV}$ is derived from the key $K_{VS}$, which is common to every VASP. 
Also, for simplicity of key management, the user-SP shared key $K_{US}$ is computed 
using the user identity and a master key of the SP. 




Fig. 5: Enhanced Payment Scheme 



The payment protocol assuming the use of hash chain is described in Figure 6. The 
user generates two separate commitments: one to the VASP, and another to the SP. 
The usage of pre-images of the hash chain is the same as in Figure 3. The 
computation of commitment for the VASP uses the shared key $K_{UV}$, the identities of the user 
and the VASP, random challenge $r_V$ and time-stamp $TS_V$ from the VASP, and the 
result of hashing, $c_0$. This commitment value is, in turn, input, together with the shared 
key $K_{US}$ between the user and the SP, to the commitment generation procedure for the 
SP. Note that by including the time-stamp, which may be simply the date (yymmdd), 
this scheme is secure against the replay attack by the VASP which was possible in the 
scheme described in the previous section. The burden of computing the hash chain, if 
any, may be alleviated by reusing the previously generated hash chain in such a way 
that the remaining preimage with the smallest index is used for the commitment 
generation. Summarizing, the setting of this scheme basically comes from the SVP 
mechanism proposed by Stern and Vaudenay, and the actual payment protocol from 
the hash chain setting. 

Fig. 6: Payment Protocol 

An example payment protocol set based on this enhanced scheme is shown in the 
following. We first summarise the goals of payment initialization protocol as follows. 

• Mutual authentication between the user and the VASP 

• Authentic and secure key establishment 

• Mutual session key control 

• Weak non-repudiation of the user to both the VASP and the SP based on shared 
keys 

• Payment parameter initialisation 



5.1 Payment Initialization Protocol 



U: User, V: VASP, S: SP 

1. U → V: U, $r_U$ 

2. U ← V: $r_V$, $h(r_U, r_V, K_{UV})$, ch_data, $TS_V$ 

3. U → V: $c_0$, commitment_V, commitment_S 

U: commitment_V = $h(U, V, K_{UV}, r_U, r_V, TS_V, ch\_data, c_0)$ 
commitment_S = $h(commitment\_V, K_{US})$ 

Protocol description. 

• First message: the user sends the VASP his identity U and a random challenge $r_U$. 

• Second message: the VASP computes the common shared secret key $K_{UV}$ using the 
user identity U and the shared secret key $K_{VS}$, chooses a random challenge $r_V$ and 
computes $h(r_U, r_V, K_{UV})$. It delivers to the user the random challenge $r_V$, $h(r_U, r_V, K_{UV})$, 
charging data ch_data and the time-stamp $TS_V$. 

• Third message: upon receiving the second message from the VASP, the user 
computes $h(r_U, r_V, K_{UV})$, the value of which is compared with the received hash value 
from the VASP. The match of two values guarantees that the VASP has an 
authentic secret key $K_{VS}$. After that, the user computes the two commitments to the VASP 
and the SP using the secret keys $K_{UV}$ and $K_{US}$, respectively. The VASP checks the 
first commitment value by computing the same calculation as the user and 
comparing the result with the received value. If both values match, then it has 
confidence that the user has the correct shared secret key $K_{UV}$. 
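
The three messages can be sketched as follows. This is a minimal sketch under stated assumptions: h is SHA-256 over length-prefixed fields, the key derivation `derive_key` is a hypothetical stand-in (the text only says that $K_{UV}$ is derived from U and $K_{VS}$, and $K_{US}$ from U and a master key of the SP), and every function name is illustrative.

```python
import hashlib
import hmac
import os


def h(*fields: bytes) -> bytes:
    """SHA-256 over length-prefixed fields (an assumed instantiation of h)."""
    digest = hashlib.sha256()
    for field in fields:
        digest.update(len(field).to_bytes(4, "big") + field)
    return digest.digest()


def derive_key(master: bytes, identity: bytes) -> bytes:
    """Hypothetical derivation of K_UV from K_VS, or of K_US from the SP master key."""
    return h(master, identity)


def vasp_reply(k_vs, user, r_u, ch_data, ts_v):
    """Message 2: the VASP derives K_UV and proves knowledge of it."""
    k_uv = derive_key(k_vs, user)
    r_v = os.urandom(16)
    return r_v, h(r_u, r_v, k_uv), ch_data, ts_v


def user_commit(user, vasp, k_uv, k_us, r_u, reply, c0):
    """Message 3: the user checks the VASP's proof, then forms both commitments."""
    r_v, proof, ch_data, ts_v = reply
    if not hmac.compare_digest(proof, h(r_u, r_v, k_uv)):
        raise ValueError("VASP failed authentication")
    commitment_v = h(user, vasp, k_uv, r_u, r_v, ts_v, ch_data, c0)
    commitment_s = h(commitment_v, k_us)
    return c0, commitment_v, commitment_s


def vasp_verify(k_vs, user, vasp, r_u, reply, message3) -> bool:
    """The VASP recomputes commitment_V and compares it with the received value."""
    r_v, _, ch_data, ts_v = reply
    c0, commitment_v, _ = message3
    k_uv = derive_key(k_vs, user)
    expected = h(user, vasp, k_uv, r_u, r_v, ts_v, ch_data, c0)
    return hmac.compare_digest(commitment_v, expected)


K_VS, SP_MASTER = os.urandom(32), os.urandom(32)
U, V = b"user-1", b"vasp-1"
k_uv, k_us = derive_key(K_VS, U), derive_key(SP_MASTER, U)   # held by the user's device
r_u = os.urandom(16)                                          # message 1: U, r_U
reply = vasp_reply(K_VS, U, r_u, b"1 unit per tick", b"001208")
message3 = user_commit(U, V, k_uv, k_us, r_u, reply, c0=h(b"placeholder chain root"))
```

The VASP cannot check commitment_S, since it does not hold $K_{US}$; it simply forwards that value to the SP at clearing time.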



5.2 Payment Protocol 



U: User, V: VASP 

1. U → V: $c_j$ (j = 1, ..., N) 

Protocol description. 

When the user and VASP need to exchange the actual payment data for a unit of 
charged service, the user sends the relevant preimage of the hash chain. Note that the 
three-way challenge-response messages are not used but a simple tick (a preimage of 
the hash chain) is delivered from the user to the VASP when required. Therefore this 
scheme achieves a significant improvement in terms of signalling load from the tam- 
per-resistant device scheme proposed by Stern and Vaudenay. 



5.3 Payment Clearing 



V: VASP, S: SP 

1. V → S: $c_0$, $c_{j\_max}$, j_max, U, V, $r_U$, $r_V$, ch_data, $TS_V$, commitment_V, 
commitment_S 



Protocol description. 

After the transaction, the VASP stores the payment data for billing, which includes 
the user identity U, the user’s commitments to the VASP and the SP, and the required 
data for the verification of the signature, i.e., $r_U$, $r_V$, ch_data, $TS_V$, $c_0$, the last received 
pre-image $c_{j\_max}$ and the corresponding index value j_max, which equals the total 
number of ticks paid by the user in the transaction. The SP checks the commitment_S 
field by computing the same calculation as the user and comparing the result with the 
received value. The confidence gained by this check is to ensure that the commitment 
could not have been formed by the VASP, even if the tamper resistance of the 
VASP’s device has been compromised. 
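
A minimal sketch of the SP's side of clearing follows. It assumes SHA-256 over length-prefixed fields as h, takes commitment_V as received rather than recomputing it, and adds an explicit check that hashing the last pre-image j_max times yields $c_0$; that chain check is an assumption of this sketch about how the tick count would be validated, and the function names are hypothetical.

```python
import hashlib
import hmac


def h(*fields: bytes) -> bytes:
    """SHA-256 over length-prefixed fields (an assumed instantiation of h)."""
    digest = hashlib.sha256()
    for field in fields:
        digest.update(len(field).to_bytes(4, "big") + field)
    return digest.digest()


def chain_matches(c0: bytes, c_jmax: bytes, j_max: int) -> bool:
    """Hash the last received pre-image j_max times and compare with the root c_0."""
    value = c_jmax
    for _ in range(j_max):
        value = h(value)
    return hmac.compare_digest(value, c0)


def sp_clear(k_us, commitment_v, commitment_s, c0, c_jmax, j_max) -> bool:
    """Accept the claim only if commitment_S verifies under K_US and the chain matches."""
    if not hmac.compare_digest(commitment_s, h(commitment_v, k_us)):
        return False
    return chain_matches(c0, c_jmax, j_max)


# Demo data: a chain of length 5 and commitments formed as in Section 5.1.
chain = [b"demo seed for c_5"]
for _ in range(5):
    chain.append(h(chain[-1]))
chain.reverse()                                  # chain[i] is now c_i
k_us = b"\x01" * 32
commitment_v = h(b"commitment_V as sent by the user", chain[0])
commitment_s = h(commitment_v, k_us)
ok = sp_clear(k_us, commitment_v, commitment_s, chain[0], chain[3], 3)
```

A VASP that inflates j_max would need a pre-image further along the chain, which it cannot compute from the values it has received.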

5.4 Comparison with SVP 

Compared to the original SVP protocol there are two main advantages that we can 
claim. 

• Firstly, the interactive three move protocol for every micropayment has been 
avoided. This can be an important issue for mobile communications where call 
charges are still large in comparison with Internet based communications. It also 
reduces delay and removes the possibility of incomplete payment protocols due 
to communications failures. 

• Secondly, mutual authentication between the user and the VASP is provided 
while keeping the protocol efficient in terms of computational and bandwidth re- 
quirements. 

If we compare the disadvantages of the SVP scheme with our enhanced version we 
see that all disadvantages have been overcome except that there is still no signature to 
provide non-repudiation of user payments. 



6 Conclusion 

In the foreseeable future, mobile communication terminals will be a major method for 
electronic commerce, at least in transactions of small amounts. The well-studied and 
efficiency-proven hash chain scheme relies on the existence of digital signatures and 
an associated public key infrastructure. In this paper the alternative of using a tamper- 
resistant device has been explored. It has been shown that the SVP scheme has limi- 
tations for use in this environment. We have proposed an improved scheme which 
provides much reduced risk at no significant additional cost. We suggest that our 
enhanced scheme is suitable for implementation in current wireless communications 
systems. 



Abstract. This paper gives new examples that exploit the idea of 
using sparse polynomials with restricted coefficients over a finite ring for 
designing fast, reliable cryptosystems and identification schemes. 



1 Overview 

The idea of using polynomials with restricted coefficients in cryptography, 
though fairly new, has already found several cryptographic applications such 
as the NTRU cryptosystem [10], the ENROOT cryptosystem [6], the PASS 
identification scheme [9,11], and the SPIFI identification scheme [2]; see also [8]. 

In contrast to the constructions of NTRU and PASS, which consider classes 
of low-degree polynomials with many “small” nonzero coefficients, ENROOT 
and SPIFI are based on the use of polynomials of high degree that are extremely 
sparse. Although these latter constructions were originally considered only over 
finite fields, in this paper we improve and extend the ideas of [2,6] and show that 
both ENROOT and SPIFI can be generalized to the setting of an arbitrary 
finite ring. In this generality, the user can be assured of an extra degree of 
security by selecting rings in which the problem of solving polynomial equations 
is notoriously difficult, as in the case of the residue ring for an RSA modulus 
M = pl, where p and l are two privately held primes. In this paper, we have 
also introduced several new security features for the ENROOT and SPIFI 
protocols. Quite recently several powerful attacks on the original versions of 
ENROOT and SPIFI and some of their modifications have been presented 
in [1]. In particular, the present version is a result of our iterative (thanks to 
many fruitful discussions with the authors of [1]) attempts to make ENROOT 
and SPIFI resistant to attacks of the types described in [1]. Although we believe 
that there is no immediate “danger”, it seems that these attacks still present 
a serious threat to ENROOT and SPIFI. Nevertheless, these objects, sparse 
polynomials, look too nice and natural (easy to evaluate but hard to invert) not 
to try to use them for public key cryptography. We hope that our paper may 
help to bring more attention to this area. 



2 Notation and Definitions 

Throughout this paper, log z denotes the binary logarithm of z. 

Let $\mathcal{R}$ be an arbitrary finite ring, and let N denote a fixed multiple of the 
exponent $\exp(\mathcal{R}^*)$ of the multiplicative group of units $\mathcal{R}^*$; thus we have 
$a^{N+1} = a$ for all $a \in \mathcal{R}^* \cup \{0\}$. 

To illustrate our ideas below, we will sometimes consider two important 
special cases, which we refer to as the “$\mathbb{F}_q$-case” and the “$\mathbb{Z}_M$-case.” In the $\mathbb{F}_q$-case, 
$\mathcal{R}$ is the finite field $\mathbb{F}_q$ with q elements, and we can take $N = \exp(\mathcal{R}^*) = q - 1$. 
In the $\mathbb{Z}_M$-case, $\mathcal{R}$ is the ring $\mathbb{Z}/M\mathbb{Z}$ of residue classes with respect to an 
RSA modulus M = pl, where p and l are primes. In this case, we can 
either take $N = \varphi(M) = (p - 1)(l - 1)$, where $\varphi$ is the Euler function, or 
$N = \lambda(M) = \mathrm{lcm}(p - 1, l - 1)$ ($\lambda$ is the Carmichael function). We remark 
that in all of these cases, we have $\log |\mathcal{R}| \approx \log |\mathcal{R}^*| \approx \log N$. 

We also assume that any element of $\mathcal{R}$ can be encoded by using about $\log |\mathcal{R}|$ 
bits. 

Let d be a fixed positive integer. Given a set $S \subseteq \mathcal{R}$, we say that a polynomial 
$f(x_1, \ldots, x_d) \in \mathcal{R}[x_1, \ldots, x_d]$ is an S-polynomial if every coefficient of f belongs 
to S. 

An expression of the form $a x_1^{e_1} \cdots x_d^{e_d}$ we call a monomial with the coefficient 
a and the exponent $(e_1, \ldots, e_d)$. 

Finally, we say that a polynomial $f(x_1, \ldots, x_d) \in \mathcal{R}[x_1, \ldots, x_d]$ is t-sparse if 
f has at most t nonzero coefficients. 
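
To make the definitions concrete, the following sketch checks the identity $a^{N+1} = a$ in a toy $\mathbb{Z}_M$-case and tests sparsity and the S-polynomial property for a univariate polynomial. The tiny modulus M = 11 · 13 and the dictionary representation (exponent mapped to coefficient, zero coefficients omitted) are assumptions made for illustration only.

```python
from math import gcd
from typing import Dict, Set


def carmichael_rsa(p: int, l: int) -> int:
    """lambda(M) = lcm(p - 1, l - 1) for M = p*l with distinct primes p and l."""
    return (p - 1) * (l - 1) // gcd(p - 1, l - 1)


def is_sparse(poly: Dict[int, int], t: int) -> bool:
    """A polynomial is t-sparse if it has at most t nonzero coefficients."""
    return sum(1 for c in poly.values() if c != 0) <= t


def is_s_polynomial(poly: Dict[int, int], S: Set[int]) -> bool:
    """Every stored coefficient must lie in S (omitted ones are zero, so S should contain 0)."""
    return all(c in S for c in poly.values())


p, l = 11, 13                     # toy primes, far too small for real use
M = p * l
N = carmichael_rsa(p, l)
units_and_zero = [a for a in range(M) if a == 0 or gcd(a, M) == 1]
identity_holds = all(pow(a, N + 1, M) == a for a in units_and_zero)

f = {0: 1, 17: 1, 42: 5}          # 1 + x^17 + 5*x^42
```

Here `identity_holds` is what justifies reducing products modulo $x^{N+1} - x$ later on without changing values at units.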

3 The SPIFI Identification Scheme 



In this section, we describe a generalization of SPIFI (for Secure Polynomial 
IdentiFIcation; see [2]) for an arbitrary finite ring $\mathcal{R}$. For the sake of simplicity 
and practicality, we work only with polynomials of a single variable (that is, 
d = 1). 

3.1 A Hard Problem 



The hard problem underlying our one-way functions can be stated as follows: 

Given 2m arbitrary elements $\alpha_1, \ldots, \alpha_m, \beta_1, \ldots, \beta_m \in \mathcal{R}$ and a set $S \subseteq \mathcal{R}$ 
of small cardinality, it is not feasible to find a $\tau$-sparse S-polynomial 
$f(x) \in \mathcal{R}[x]$ of degree $\deg(f) \le N$ with $f(\alpha_j) = \beta_j$ for each j = 1, ..., m, 
provided that N is of “medium” size relative to the choices of $m \ge 1$, the 
cardinality |S|, and $\tau \ge 3$. 

More precisely, we expect that if one fixes the number of points m, the 
cardinality |S|, and the sparsity $\tau \ge 3$, then the problem requires exponential 
time as $N \to \infty$ (that is, exponential with respect to the bit length of N). 

For example, let p be a prime, and consider the case where $\mathcal{R}$ is the finite field 
$\mathbb{F}_p$ with p elements. Let $a_{ij} \equiv \alpha_j^i \pmod p$ and $b_j \equiv \beta_j \pmod p$ be chosen 
so that $0 \le a_{ij}, b_j \le p - 1$ for i = 0, ..., p − 1 and j = 1, ..., m. Then in this 
simplified situation, the hard problem above is still equivalent to the problem of 
finding a feasible solution to the integer programming problem 

$$\sum_{i=0}^{p-1} x_i \varepsilon_i a_{ij} + y_j p = b_j, \quad j = 1, \ldots, m, \qquad \sum_{i=0}^{p-1} \varepsilon_i \le \tau,$$

where $y_j \in \mathbb{Z}$, $x_i \in S$, and $\varepsilon_i \in \{0, 1\}$ for all i and j. 

3.2 Basic Idea 

We fix the ring $\mathcal{R}$ and some integer parameters $k \ge 1$ and $r, s, t \ge 3$. This 
information is made public. The value of N may be kept private. Only Alice 
needs this value, so in this scenario the choice of the ring $\mathcal{R}$ (and the value of 
N) can be made by Alice. 

In addition we require that $\mathcal{R}$ contains elements of multiplicative order in 
the interval 27V^/^]. This certainly imposes some additional number 
theoretic requirements on N which in practice are easy to satisfy. 

To create the identification message Alice uses the following algorithm, which 
we still denote by SPIFI. 

Initial Set-up 

Step 1 Select at random k distinct elements $a_0, \ldots, a_{k-1} \in \mathcal{R}^*$, where $a_0$ is of 
multiplicative order in the interval specified above. 

Step 2 Select a random $\lceil t/2 \rceil$-sparse {0, 1}-polynomial $f_1(x) \in \mathcal{R}[x]$ with 
$\deg(f_1) \le N$ and $f_1(a_0) \in \mathcal{R}^*$. Next, select a random $\lfloor t/2 \rfloor$-sparse {0, 1}- 
polynomial $f_2(x) \in \mathcal{R}[x]$ with $\deg(f_2) \le N$, $f_2(a_0) \ne 0$ and $f_2(a_0) \ne -f_1(a_0)$. 

Step 3 Compute $A = -f_2(a_0) f_1(a_0)^{-1}$ and put $f(x) = A f_1(x) + f_2(x)$. Then 
f is a t-sparse {0, 1, A}-polynomial with $\deg(f) \le N$, and $f(a_0) = 0$. The 
polynomial f is the private key. 

Step 4 Compute $C_j = f(a_j)$ for j = 1, ..., k − 1. 

Step 5 Publish the set of values $\{A, a_0, \ldots, a_{k-1}, C_1, \ldots, C_{k-1}\}$ as the public key. 

To verify Alice’s identity, Alice and Bob use the following procedure. 

Verification Protocol 
Step 1 

Alice selects a random r-sparse {0, 1}-polynomial $g(x) \in \mathcal{R}[x]$ with $\deg(g) \le N$ 
and g(0) = 0 and a random $\lfloor s/2 \rfloor$-sparse {0, 1, B}-polynomial $h_1(x) \in \mathcal{R}[x]$ 
of degree $\deg(h_1) \le N$ with $B \ne 0, 1$ or A, computes 

$D_j = g(a_j)$, j = 1, ..., k − 1, 

and sends the sum $D = D_1 + \ldots + D_{k-1}$ to Bob. 

Step 2 

Bob selects a random $(s - \lfloor s/2 \rfloor)$-sparse {0, 1, B}-polynomial $h_2(x) \in \mathcal{R}[x]$ of 
degree $\deg(h_2) \le N$ which has no common monomials with $h_1$, computes 
$h = h_1 + h_2$ and sends h to Alice. 

Step 3 

Alice verifies that h is an s-sparse {0, 1, B}-polynomial which contains all 
the terms of the polynomial $h_1$, computes 

$F(x) = f(x) g(x) h(x) \pmod{x^{N+1} - x}$ 

and sends the polynomial F and $\{D_1, \ldots, D_{k-1}\}$ to Bob. 

Step 4 

Bob computes 

$E_j = h(a_j)$, j = 1, ..., k − 1, 

and verifies that 

$D_1 + \ldots + D_{k-1} = D$ 

and that F(x) is an rst-sparse {0, 1, A, B, AB}-polynomial with $\deg(F) \le N$, 
$F(a_0) = 0$, and 

$F(a_j) = C_j D_j E_j$, j = 1, ..., k − 1. 

Of course, there is a chance that the constructed polynomial F(x) is not 
a {0, 1, A, B, AB}-polynomial; however, if rst is substantially smaller than N, 
then this chance is negligible (and in this case, Alice and Bob can repeat the 
procedure). 
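
The set-up and one round of the verification protocol can be sketched in a toy $\mathbb{F}_p$-case as follows. Everything here is an illustrative assumption rather than a recommended instantiation: the prime P, the parameter values, the dictionary representation of sparse polynomials, the disjoint supports chosen for $f_1$ and $f_2$, and the omission of the order condition on $a_0$. Alice is assumed to pass $h_1$ and B to Bob along with D.

```python
import random

P = 1_000_003                 # toy prime, so R = F_P and N = exp(R*) = P - 1
N = P - 1
rng = random.Random(2000)     # fixed seed for a repeatable run


def evaluate(poly, a):
    """Evaluate a sparse polynomial {exponent: coefficient} at a, modulo P."""
    return sum(c * pow(a, e, P) for e, c in poly.items()) % P


def random_poly(exponents, coeffs):
    """Give each listed exponent a random coefficient taken from coeffs."""
    return {e: rng.choice(coeffs) for e in exponents}


def multiply_reduced(*polys):
    """Multiply sparse polynomials modulo x^(N+1) - x."""
    result = {0: 1}
    for poly in polys:
        nxt = {}
        for e1, c1 in result.items():
            for e2, c2 in poly.items():
                e = e1 + e2
                if e > N:
                    e = (e - 1) % N + 1          # uses x^(N+1) = x
                nxt[e] = (nxt.get(e, 0) + c1 * c2) % P
        result = {e: c for e, c in nxt.items() if c}
    return result


def keygen(k, t):
    """Initial set-up: private f with f(a_0) = 0, public (A, a, C)."""
    a = rng.sample(range(2, P), k)
    while True:
        exps = rng.sample(range(N + 1), t)       # disjoint supports for f1 and f2
        f1 = random_poly(exps[:(t + 1) // 2], [1])
        f2 = random_poly(exps[(t + 1) // 2:], [1])
        v1, v2 = evaluate(f1, a[0]), evaluate(f2, a[0])
        if v1 and v2 and (v1 + v2) % P:
            break
    A = -v2 * pow(v1, -1, P) % P
    f = {**{e: A for e in f1}, **f2}             # f = A*f1 + f2
    C = [evaluate(f, aj) for aj in a[1:]]
    return f, (A, a, C)


def alice_commit(pub, r, s):
    """Step 1: Alice picks g, B and h1 and commits to D = D_1 + ... + D_{k-1}."""
    A, a, _ = pub
    g = random_poly(rng.sample(range(1, N + 1), r), [1])     # g(0) = 0
    B = rng.choice([b for b in range(2, 100) if b != A])
    h1 = random_poly(rng.sample(range(N + 1), s // 2), [1, B])
    D_list = [evaluate(g, aj) for aj in a[1:]]
    return g, B, h1, D_list, sum(D_list) % P


def bob_challenge(h1, B, s):
    """Step 2: Bob extends h1 by h2 on fresh monomials and returns h = h1 + h2."""
    while True:
        h2 = random_poly(rng.sample(range(N + 1), s - s // 2), [1, B])
        if not set(h1) & set(h2):
            return {**h1, **h2}


def bob_verify(F, D, D_list, h, B, pub, r, s, t):
    """Step 4: Bob's checks on the response F and the opened values D_j."""
    A, a, C = pub
    if sum(D_list) % P != D or not F or len(F) > r * s * t or max(F) > N:
        return False
    if any(c not in {1, A, B, A * B % P} for c in F.values()):
        return False
    if evaluate(F, a[0]) != 0:
        return False
    return all(evaluate(F, aj) == Cj * Dj * evaluate(h, aj) % P
               for aj, Cj, Dj in zip(a[1:], C, D_list))


r, s, t, k = 3, 4, 4, 4
f, pub = keygen(k, t)
for _ in range(5):                                # repeat if monomials collide
    g, B, h1, D_list, D = alice_commit(pub, r, s)
    h = bob_challenge(h1, B, s)
    F = multiply_reduced(f, g, h)                 # Step 3
    accepted = bob_verify(F, D, D_list, h, B, pub, r, s, t)
    if accepted:
        break
print("accepted:", accepted)
```

The retry loop mirrors the remark above: a round is simply repeated in the unlikely event that colliding monomials push a coefficient of F outside {0, 1, A, B, AB}.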

3.3 Efficiency 

The sparsity of the polynomials involved guarantees computational efficiency 
for this scheme. Using (naive) repeated squaring, one can compute the power 
$a^e$ for any $a \in \mathcal{R}$ and $0 \le e \le N$ in about 2 log N arithmetic operations in 
$\mathcal{R}$ in the worst case, or about 1.5 log N arithmetic operations “on average”; see 
Section 1.3 of [3], Section 4.3 of [4], or Section 2.1 of [5]. Consequently, any 
$\tau$-sparse polynomial $f(x) \in \mathcal{R}[x]$ of degree at most N can be evaluated at any 
point in about $O(\tau \log N)$ arithmetic operations in $\mathcal{R}$. 
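
The operation count can be checked with a small sketch of naive square-and-multiply that tallies ring multiplications. Working modulo an integer and the function names `power` and `evaluate_sparse` are assumptions of this illustration.

```python
def power(a: int, e: int, mod: int):
    """Left-to-right square-and-multiply; returns (a^e mod mod, multiplications used)."""
    result, ops = 1, 0
    for bit in bin(e)[2:]:
        result = result * result % mod        # one squaring per bit of e
        ops += 1
        if bit == "1":
            result = result * a % mod         # one extra multiplication per set bit
            ops += 1
    return result, ops


def evaluate_sparse(poly: dict, a: int, mod: int):
    """Evaluate {exponent: coefficient} at a; returns (value, multiplications used)."""
    total, ops = 0, 0
    for e, c in poly.items():
        value, used = power(a, e, mod)
        total = (total + c * value) % mod
        ops += used + 1                       # plus the multiplication by the coefficient
    return total, ops


value, ops = power(3, 13, 1000)
poly_value, poly_ops = evaluate_sparse({5: 1, 900: 7, 12345: 1}, 2, 1_000_003)
```

Each term costs at most about twice the bit length of its exponent, so a $\tau$-sparse polynomial of degree at most N costs on the order of $\tau \log N$ multiplications.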

We recall that any element of $\mathcal{R}$ can be encoded by using about $\log |\mathcal{R}|$ bits. 

Finally, we remark that if $0 \in S \subseteq \mathcal{R}$, then any $\tau$-sparse S-polynomial 
$f(x) \in \mathcal{R}[x]$ of degree at most N can be encoded with about $\tau \log(N|S| - N)$ 
bits. To do this, we have to identify at most $\tau$ positions at which f has a nonzero 
coefficient. The encoding of each position requires about log N bits, and for 
each such position, about $\log(|S| - 1)$ bits are then required to determine the 
corresponding element of S. 

For example, the encoding of the values of B, $D_1, \ldots, D_{k-1}$ and the sum 
$D = D_1 + \ldots + D_{k-1}$ requires about $(k + 1) \log |\mathcal{R}|$ bits. The identification 
message must encode rst positions of the polynomial F (corresponding to its 
nonzero coefficients), which takes about rst log N bits. Each position requires 
two additional bits to distinguish between the possible nonzero coefficients 1, 
A, B and AB. For the same reason the encoding of the polynomial $h_1$ 
requires $\lfloor s/2 \rfloor \log(4N)$ bits. Hence the total number of bits which are sent from 
Alice to Bob is about $\lfloor s/2 \rfloor \log(4N) + (k + 1) \log |\mathcal{R}| + rst \log(4N)$. 

Putting everything together, after simple calculations we derive that (using 
the naive repeated squaring exponentiation) 



o the initial set-up takes $O(kt \log N)$ arithmetic operations in $\mathcal{R}$; 
o the private key size is about $t \log(2N)$ bits; 
o the public key size is about $k(\log |\mathcal{R}| + \log |\mathcal{R}^*|)$ bits; 

o the identification message generation, that is, computation of the polynomial 
F, elements $D_j$, j = 1, ..., k − 1, and their sum D, takes $O(rst)$ arithmetic 
operations with integer numbers in the range [0, 2N] and $O((k - 1) r \log N)$ 
arithmetic operations in $\mathcal{R}$; 

o the identification message size is about $rst \log(4N) + k \log |\mathcal{R}|$ bits; 
o the identification message verification, that is, computation of $D_1 + \ldots + D_{k-1}$, 
$F(a_j)$ and the products $C_j D_j E_j$, j = 1, ..., k − 1, takes about $O(k\, rst \log N)$ 
arithmetic operations in $\mathcal{R}$. 
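
To get a feeling for these estimates, the size formulas can be evaluated numerically. The sketch below covers only the $\mathbb{F}_q$-case, where $|\mathcal{R}| = q$ and $N = |\mathcal{R}^*| = q - 1$; the function name and the sample parameters are hypothetical and are not proposed parameter choices.

```python
from math import log2


def spifi_sizes(q: int, k: int, r: int, s: int, t: int) -> dict:
    """Approximate sizes in bits for the F_q-case, following the estimates above."""
    N = q - 1
    return {
        "private_key": t * log2(2 * N),
        "public_key": k * (log2(q) + log2(q - 1)),
        "identification_message": r * s * t * log2(4 * N) + k * log2(q),
    }


sizes = spifi_sizes(q=2**31 - 1, k=3, r=5, s=5, t=5)
for name, bits in sizes.items():
    print(f"{name}: about {bits:.0f} bits")
```

The identification message dominates because it grows with the product rst, while both keys grow only linearly in t or k.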

We remark that the practical and asymptotic performance of the SPIFI 
scheme can be improved if one uses more sophisticated algorithms to evaluate 
powers and sparse polynomials; see [3,4,5,13,15]. In particular, one can use 
precomputation of certain powers of the $a_j$, j = 1, ..., k − 1, and several other clever 
tricks which we do not consider in this paper. 

3.4 Possible Attacks 

It is clear that recovering or faking the private key (that is, finding a t-sparse 
{0, 1, A}-polynomial $f(x) \in \mathcal{R}[x]$ with $f(a_0) = 0$ and $f(a_j) = C_j$ for 
j = 1, ..., k − 1) or faking the identification message (that is, finding an rst-sparse 
{0, 1, A, B, AB}-polynomial $F(x) \in \mathcal{R}[x]$ with $F(a_0) = 0$ and $F(a_j) = C_j D_j E_j$ 
for j = 1, ..., k − 1) are versions of the hard problem mentioned in Section 3.1 
(with slightly different parameters). 

We also remark that without the reduction 

$f(x) g(x) h(x) \pmod{x^{N+1} - x}$, 

one of the possible attacks might be via polynomial factorization. In a 
practical implementation of this scheme, one should make sure that both f and g have 
terms of degree greater than N/2 so there are some reductions. Even without the 
reduction modulo $x^{N+1} - x$, the factorization attack does not seem to be feasible 
because of the large degrees of the polynomials involved; all known factorization 
algorithms (as well as their important components such as irreducibility testing 
and the greatest common divisor algorithms) do not take advantage of sparsity 
or any special structure of the coefficients; see [4,14]. Moreover, the first factor 
that any of these algorithms will find would be the trivial one, that is, $(x - a_0)$. 
But the quotient $F(x)/(x - a_0)$ is most likely neither sparse nor an S-polynomial 
for any small set S. Finally, we remark that if one works in the setting of a ring 
$\mathcal{R}$ that is not a field (such as the $\mathbb{Z}_M$-case), then the problem of factorization 
becomes much more complicated, so this type of attack is even less likely to 
succeed. 

It is possible that by using some “clever” choice of polynomials h, after several 
rounds of identification, Bob might be able to gain some information about /. 
But the polynomials g are specifically designed to prevent him from doing this. 
In Section 3.5 below, we present another idea which should render this attack 
completely infeasible, at least in the $\mathbb{F}_q$-case. 

One might also consider lattice attacks. In particular, one can try to select an 
rt-sparse {0, 1, A}-polynomial $e(x) \in \mathcal{R}[x]$ with $e(a_0) = 0$, compute 

$D_j = e(a_j) C_j^{-1}$, j = 1, ..., k − 1, 

and then send these values together with 

$F(x) = e(x) h(x) \pmod{x^{N+1} - x}$ 

to the verifier. In principle this attack could succeed, but finding such a 
polynomial e is a kind of knapsack problem and since the dimension of the corresponding 
lattice would be equal to the (very large) degree N of the polynomials involved, 
any such attack seems completely infeasible at this time. With current 
technology, one can reduce lattices of dimension only in the hundreds, while in a practical 
implementation of this scheme our lattices will have dimension N of much larger 
order of magnitude. Another attempt to construct such a polynomial e could be 
via solving the discrete logarithm in $\mathcal{R}$ to base $a_0$, see [1]. However, because $a_0$ 
is selected to be of small order, this attack is very unlikely to succeed either. 

There is also another way to avoid the above attack via constructing an rt- 
sparse {0, 1, A}-polynomial $e(x) \in \mathcal{R}[x]$ with $e(a_0) = 0$. This way does not 
require any restrictions on the order of $a_0$ and thus can be used for arbitrary N. 
Namely we request that for each of the polynomials f(x), g(x) and h(x) the sum 
of the degrees of the monomials is divisible by N. In this case the same condition 
also holds for F(x). Indeed, if an s-sparse polynomial and a t-sparse polynomial 
have monomials of degrees $n_1, \ldots, n_s$ and $m_1, \ldots, m_t$, respectively, with 

$$\sum_{i=1}^{s} n_i \equiv 0 \pmod N \quad \text{and} \quad \sum_{j=1}^{t} m_j \equiv 0 \pmod N,$$

then their product has st monomials of degrees $n_i + m_j$, i = 1, ..., s, j = 1, ..., t (unless a 
collision occurs which is very unlikely). Then 

$$\sum_{i=1}^{s} \sum_{j=1}^{t} (n_i + m_j) = \sum_{i=1}^{s} \Bigl( t\, n_i + \sum_{j=1}^{t} m_j \Bigr) = t \sum_{i=1}^{s} n_i + s \sum_{j=1}^{t} m_j \equiv 0 \pmod N$$

as claimed. Therefore the aforementioned discrete logarithm attack from [1] is 
very unlikely (with probability about 1/N) to produce a polynomial e(x) which 
besides the aforementioned conditions also has the sum of the degrees of the 
monomials which is divisible by N. 
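
The degree-sum argument is easy to check numerically. The following sketch uses an arbitrary toy value of N and hypothetical helper names; it also confirms that reducing each degree modulo $x^{N+1} - x$ changes it only by a multiple of N.

```python
import random


def degrees_with_zero_sum(count: int, N: int, rng: random.Random) -> list:
    """Pick `count` distinct degrees in [1, N] whose sum is divisible by N."""
    while True:
        degs = rng.sample(range(1, N + 1), count - 1)
        last = -sum(degs) % N or N            # the degree that completes the sum
        if last not in degs:
            return degs + [last]


def reduce_degree(e: int, N: int) -> int:
    """Degree of x^e after reduction modulo x^(N+1) - x."""
    return e if e <= N else (e - 1) % N + 1


N = 10**6 + 2
rng = random.Random(1)
n = degrees_with_zero_sum(4, N, rng)          # degrees of an s-sparse polynomial, s = 4
m = degrees_with_zero_sum(5, N, rng)          # degrees of a t-sparse polynomial, t = 5
product_degrees = [reduce_degree(ni + mj, N) for ni in n for mj in m]
```

The sum of `product_degrees` is again divisible by N, in line with the displayed identity.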

We remark that, although using composite moduli may add some additional 
security features, the security of the SPIFI scheme is not compromised even if 
the factorization of M is known. In fact, we believe that even in the case where 
the modulus is a (sufficiently large) prime (that is, in the $\mathbb{F}_q$-case), the scheme 
is still very secure. 

It has turned out that Alice must make some commitment about the values 
of $D_1, \ldots, D_{k-1}$ before she receives the polynomial h from Bob, otherwise there 
is a very simple attack on this scheme. On the other hand, sending the whole 
set to Bob before he selects his polynomial h may open some ways of attacking 
for a “cheating” verifier. Sending the sum $D = D_1 + \ldots + D_{k-1}$ is just one of 
many possible ways for Alice to undertake some commitments about the values 
of $D_1, \ldots, D_{k-1}$ (just reducing the probability of the aforementioned “on-line” 
attack to 1/N). Probably a more practical way would be just sending about a 
half of the bits of $D_1, \ldots, D_{k-1}$ at Step 1 (instead of computing and sending D) 
and then sending the rest of the bits at Step 3 (which likewise reduces the probability of 
the aforementioned “on-line” attack). 

Moreover, the SPIFI scheme is easily modified so that the value $N = \varphi(M)$ 
or $\lambda(M)$ (see Section 2) remains secret. Indeed, Alice can choose g in Step 1 of the 
verification protocol so that the reduction modulo $x^{N+1} - x$ that occurs in Step 3 
produces a polynomial whose degree is not “too close” to N. In fact, “on average” 
it should be about $N(1 - 1/2rst)$ for the SPIFI scheme and about N(1 − 1/2i?) for 
the ENROOT scheme (see Section 4 below) since the corresponding polynomials 
are rst-sparse and i?-sparse, respectively. Thus, in the case that $N = \varphi(M)$, the 
degree of F gives a worse approximation to N than the value of M itself, at least 
if M = pl is a product of two primes of the same order. 



3.5 Modification of the Basic Scheme 

In this subsection, we consider only the case of a finite field $\mathcal{R} = \mathbb{F}_q$. It is very
likely, however, that these ideas can be generalized to the setting of an arbitrary
ring.

In order to prevent Bob from gaining any useful information about $f$ by
selecting certain special polynomials $h$, Alice can initially construct two polynomials
$f_1$ and $f_2$, either one of which can serve as her private key. That is,
for some $A, C_1, \ldots, C_{k-1} \in \mathbb{F}_q$ with $A \ne 0, 1$ and distinct $a_0, \ldots, a_{k-1} \in \mathbb{F}_q^*$,
Alice can find two $t$-sparse $\{0, 1, A\}$-polynomials $f_1(x), f_2(x) \in \mathbb{F}_q[x]$ of degree
at most $N$ that satisfy

$$f_1(a_j) = f_2(a_j) = C_j, \qquad j = 0, \ldots, k-1, \tag{1}$$

for some $C_0, C_1, \ldots, C_{k-1} \in \mathbb{F}_q$.

To do this, Alice selects a certain parameter $n$ and considers certain random
$\{\pm 1, \pm A\}$-polynomials $\psi(x)$ of degree $4n$, looking for roots in $\mathbb{F}_q$. For values of
$n$ of reasonable size this can be done quite efficiently, at least in probabilistic
polynomial time; see [4,14].

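It follows from Theorem 3 of [12] that for sufficiently large $q$, the probability
$P_{k+1}(4n, q)$ that a random monic polynomial of degree $4n$ over $\mathbb{F}_q$ will have $k + 1$ distinct
roots in $\mathbb{F}_q$ is given by an explicit formula,

$$P_{k+1}(4n, q) = \ldots \qquad \text{[expression illegible in the source; see Theorem 3 of [12]]}$$

In particular,

$$\lim_{n \to \infty} \lim_{q \to \infty} P_{k+1}(4n, q) = \frac{1}{(k+1)!}.$$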



Letting $A$ vary randomly over $\mathbb{F}_q \setminus \{0, 1\}$, Alice considers $\{\pm 1, \pm A\}$-polynomials
$\psi(x) \in \mathbb{F}_q[x]$ of degree $4n$ which have $n$ coefficients of each type $1$, $-1$, $A$ or
$-A$. Since the number of such polynomials
is large, after $O((k+1)!\,e^{[\ldots]})$ trials Alice will find with high probability such a
polynomial with $k + 1$ distinct roots $a_0, \ldots, a_k \in \mathbb{F}_q$. By reordering if necessary,
we can assume that $a_0, \ldots, a_{k-1}$ are distinct elements in the multiplicative group
$\mathbb{F}_q^*$. Alice now writes $\psi(x) = \varphi_1(x) - \varphi_2(x)$ where $\varphi_1, \varphi_2$ are $2n$-sparse $\{0, 1, A\}$-polynomials
of degree at most $4n + 1$, and each $\varphi_i$ has $n$ coefficients of each type
$1$ or $A$. Moreover, $\varphi_1 \ne \varphi_2$, but clearly

$$\varphi_1(a_j) = \varphi_2(a_j), \qquad j = 0, \ldots, k-1.$$

Now Alice selects a random $(\lceil t/2 \rceil - n)$-sparse $\{0, 1\}$-polynomial $\psi_1(x) \in \mathbb{F}_q[x]$
with $\deg(\psi_1) \le q - 1$ and $\psi_1(a_0) \ne 0$. Alice then selects a random
$(\lfloor t/2 \rfloor - n)$-sparse $\{0, 1\}$-polynomial $\psi_2(x) \in \mathbb{F}_q[x]$ with $\deg(\psi_2) \le q - 1$. Assuming that
$\psi_1$ and $\psi_2$ have been selected so that the non-constant monomials that occur in
them have degree greater than $4n + 1$, Alice can now define

$$f_i(x) = A\psi_1(x) + \psi_2(x) + \varphi_i(x), \qquad i = 1, 2.$$

Then $f_1$ and $f_2$ are $t$-sparse $\{0, 1, A\}$-polynomials in the correct form for the
SPIFI protocol, and they satisfy (1) for some $C_0, C_1, \ldots, C_{k-1} \in \mathbb{F}_q$. We remark
that in this case the value $C_0 = f_1(a_0) = f_2(a_0)$ must be published as well
(although the scheme can easily be modified in such a way that, as before, $f_1(a_0) =
f_2(a_0) = 0$, so that this value need not be sent).

Now Alice can alternate between $f_1$ and $f_2$ in a random order to confound
Bob’s attempts to gain useful information about the private key.

It is easy to see that instead of the sum $A\psi_1(x) + \psi_2(x) + \varphi_i(x)$ for $i = 1, 2$,
one can also consider more complicated expressions involving $\{0, 1\}$-polynomials.
For example, one can put

$$f_i(x) = A\psi_1(x) + \psi_2(x) + \varphi_i(x)\varphi(x), \qquad i = 1, 2,$$

for appropriately chosen $\{0, 1\}$-polynomials $\psi_1(x)$, $\psi_2(x)$ and $\varphi(x)$.
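The construction of the two private keys can be checked on a toy instance. The sketch below is only an illustration under assumed values: the field size q = 101, the element A = 7, the case n = 1, and the particular polynomials psi_1 and psi_2 are all hypothetical choices, and psi is written down in factored form rather than found by the random search described above.

```python
# Toy illustration over F_q with q = 101, n = 1 and A = 7 (all assumed values).
q, A = 101, 7

def evaluate(poly, x):
    """Evaluate a sparse polynomial {exponent: coefficient} at x modulo q."""
    return sum(c * pow(x, e, q) for e, c in poly.items()) % q

# psi(x) = x^4 + A*x^3 - x^2 - A*x = x(x - 1)(x + 1)(x + A): degree 4n with one
# coefficient of each type 1, -1, A, -A, and three roots in F_q^*.
psi = {4: 1, 3: A, 2: -1, 1: -A}
roots = [1, q - 1, (-A) % q]

# Split psi = phi_1 - phi_2 into two 2n-sparse {0, 1, A}-polynomials.
phi = {1: {4: 1, 3: A}, 2: {2: 1, 1: A}}

# Hypothetical {0, 1}-polynomials whose non-constant monomials have degree > 4n + 1.
psi_1 = {0: 1, 40: 1, 77: 1}
psi_2 = {23: 1, 58: 1}

def private_key(i):
    """Return f_i = A*psi_1 + psi_2 + phi_i as a sparse dictionary."""
    f = {}
    for poly, scale in ((psi_1, A), (psi_2, 1), (phi[i], 1)):
        for e, c in poly.items():
            f[e] = (f.get(e, 0) + scale * c) % q
    return f

f1, f2 = private_key(1), private_key(2)
values = [(evaluate(f1, a), evaluate(f2, a)) for a in roots]
```

Because $f_1 - f_2 = \varphi_1 - \varphi_2 = \psi$, the two keys differ as polynomials but take the same value at every root of $\psi$, which is what each pair in `values` shows.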



3.6 Remarks 

It is natural to try to construct and utilize more than two $t$-sparse $\{0, 1, A\}$-polynomials
that take the same values at $k$ distinct points. However, our approach
of Section 3.5 does not seem to extend to this case.

Although we have not done so here, it would be interesting to extend our
construction to multivariate polynomials.



4 The ENROOT Cryptosystem 



In this section, we describe a generalization of ENROOT (for ENcryption with
ROOTs; see [6]) for an arbitrary finite ring $\mathcal{R}$. We will now consider polynomials
in $\mathcal{R}[x_1, \ldots, x_d]$, where $d \ge 2$ is fixed. Accordingly, we will often employ vector
notation, writing $f(\mathbf{x})$ for $f(x_1, \ldots, x_d)$, $\mathcal{R}[\mathbf{x}]$ for $\mathcal{R}[x_1, \ldots, x_d]$, etc.



4.1 Another Hard Problem

Our one-way functions are based on the following hard problem: 



Given the $r$-sparse polynomials $f_1(\mathbf{x}), \ldots, f_d(\mathbf{x}) \in \mathcal{R}[\mathbf{x}]$ of degree at
most $N$, it is not feasible to find an element $\mathbf{a} = (a_1, \ldots, a_d) \in \mathcal{R}^d$ with
$f_j(\mathbf{a}) = 0$ for $j = 1, \ldots, d$, provided that $N$ is sufficiently large relative
to the choices of $d \ge 2$ and $r \ge 3$.

Again, we expect that if one fixes the number $d \ge 2$ and the sparsity $r \ge 3$,
then the problem requires exponential time as $N \to \infty$ (see Section 4.4 below).



4.2 Basic Idea 

We fix the ring $\mathcal{R}$ and the integers $d \ge \ell \ge 3$, $s_j, t_j \ge 3$, $j = 1, \ldots, d$, such that
$t_1 = \ldots = t_\ell$. This information is made public. The value of $N$ may be kept
private. In fact, only Bob needs this value, so in this scenario the choice of the
ring $\mathcal{R}$ (and thus the value of $N$) is made by Bob.

The algorithm ENROOT can be described as follows. 

ENROOT Algorithm 

Step 1 

Alice selects $d$ random elements $a_1, \ldots, a_d \in \mathcal{R}^*$ which form her private key.

Step 2

Alice selects $d$ random polynomials $h_j(\mathbf{x}) \in \mathcal{R}[\mathbf{x}]$, of degree at most $|\mathcal{R}|$,
containing at most $t_j - 1$ monomials, $j = 1, \ldots, d$, such that the first $\ell$ polynomials
$h_1(\mathbf{x}), \ldots, h_\ell(\mathbf{x})$ have the same set $\mathcal{E}$ of exponents of their monomials.

Step 3

Alice publishes the polynomials $f_j(\mathbf{x}) = h_j(\mathbf{x}) - h_j(\mathbf{a})$ for $j = 1, \ldots, d$ as
her public key, where $\mathbf{a}$ is the vector $(a_1, \ldots, a_d) \in (\mathcal{R}^*)^d$.

Step 4

To send a message $m \in \mathcal{R}$, Bob selects $d$ random polynomials $g_j(\mathbf{x}) \in \mathcal{R}[\mathbf{x}]$
of degree at most $N$, containing at most $s_j - 1$ monomials such that one
monomial has an exponent from the set $\mathcal{E}$, and having nonzero constant
coefficients. Bob then computes the reduction $F(\mathbf{x})$ of the polynomial

$$m + f_1(\mathbf{x})g_1(\mathbf{x}) + \ldots + f_d(\mathbf{x})g_d(\mathbf{x})$$

modulo the ideal in $\mathcal{R}[\mathbf{x}]$ generated by $\{x_1^{N+1} - x_1, \ldots, x_d^{N+1} - x_d\}$, and he
sends $F(\mathbf{x})$ to Alice.

Step 5

To decrypt the message, Alice simply computes $F(\mathbf{a}) = m$.
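The five steps can be sketched as a toy program over a prime field. This is a minimal illustration, not the authors' implementation: the prime 10007, the choice d = 3, the sparsities t = s = 4, the helper names, and the use of Python's `random` module (which is not cryptographically secure) are all assumptions, and the shared exponent set of Step 2 is omitted for brevity. Exponents are reduced with the rule $x^{N+1} = x$, where $N = q - 1$.

```python
import random

q = 10007          # toy prime; the ring is F_q, so N = q - 1
N = q - 1
d = 3              # number of variables (toy value)

def reduce_exp(e):
    """Reduce an exponent using x^(N+1) = x, keeping exponent 0 fixed."""
    return e if e == 0 else (e - 1) % N + 1

def evaluate(poly, point):
    """Evaluate a sparse polynomial {exponent tuple: coefficient} at a point."""
    total = 0
    for exps, c in poly.items():
        term = c
        for x, e in zip(point, exps):
            term = term * pow(x, e, q) % q
        total = (total + term) % q
    return total

def random_sparse(rng, terms):
    """Random polynomial with `terms` non-constant monomials, nonzero coefficients."""
    poly = {}
    while len(poly) < terms:
        exps = tuple(rng.randrange(1, N + 1) for _ in range(d))
        poly[exps] = rng.randrange(1, q)
    return poly

def keygen(rng, t=4):
    a = tuple(rng.randrange(1, q) for _ in range(d))      # Step 1: private key
    public = []
    for _ in range(d):
        h = random_sparse(rng, t - 1)                     # Step 2
        f = dict(h)
        f[(0,) * d] = (-evaluate(h, a)) % q               # Step 3: f_j = h_j - h_j(a)
        public.append(f)
    return a, public

def encrypt(rng, public, m, s=4):
    F = {(0,) * d: m % q}                                 # Step 4
    for f in public:
        g = random_sparse(rng, s - 2)
        g[(0,) * d] = rng.randrange(1, q)                 # nonzero constant coefficient
        for ef, cf in f.items():
            for eg, cg in g.items():
                exps = tuple(reduce_exp(x + y) for x, y in zip(ef, eg))
                F[exps] = (F.get(exps, 0) + cf * cg) % q
    return F

def decrypt(a, F):
    return evaluate(F, a)                                 # Step 5: F(a) = m

rng = random.Random(2000)
a, public = keygen(rng)
ciphertext = encrypt(rng, public, 4242)
recovered = decrypt(a, ciphertext)
```

Decryption works because every $f_j$ vanishes at $\mathbf{a}$, so all product terms drop out and only $m$ remains.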



4.3 Efficiency

The sparsity of the polynomials involved again provides computational efficiency
for this scheme. Using repeated squaring, one can compute the monomial
$a_1^{e_1} \cdots a_d^{e_d}$ for any $(a_1, \ldots, a_d) \in \mathcal{R}^d$ and $0 \le e_j \le |\mathcal{R}|$, $j = 1, \ldots, d$, in about
$O(d \log(2|\mathcal{R}|))$ arithmetic operations in $\mathcal{R}$. Consequently, any $r$-sparse polynomial
$f(\mathbf{x}) \in \mathcal{R}[\mathbf{x}]$ of degree at most $|\mathcal{R}|$ can be evaluated at any point in $\mathcal{R}^d$ in
about $O(rd \log(2|\mathcal{R}|))$ arithmetic operations in $\mathcal{R}$.
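The operation count above comes from evaluating each monomial with repeated squaring. A minimal sketch, assuming the ring is the integers modulo a given modulus and that a polynomial is given as a list of (coefficient, exponent tuple) pairs (both representational choices are illustrative, not taken from the paper):

```python
def power(base, exponent, modulus):
    """Right-to-left repeated squaring: about 2*log2(exponent) multiplications."""
    result, base = 1, base % modulus
    while exponent > 0:
        if exponent & 1:
            result = result * base % modulus
        base = base * base % modulus
        exponent >>= 1
    return result

def evaluate_sparse(terms, point, modulus):
    """Evaluate sum(c * prod(a_i ** e_i)) for terms given as (c, (e_1, ..., e_d))."""
    total = 0
    for coefficient, exponents in terms:
        monomial = coefficient % modulus
        for a, e in zip(point, exponents):
            monomial = monomial * power(a, e, modulus) % modulus
        total = (total + monomial) % modulus
    return total

# 3*x^5*y^2 + 7*x*y^100 + 1 at (x, y) = (2, 3) modulo 101
value = evaluate_sparse([(3, (5, 2)), (7, (1, 100)), (1, (0, 0))], (2, 3), 101)
```

The cost depends on the number of terms and on the bit length of the exponents, not on the degree itself, which is why very high degrees remain practical.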

We remark that any $r$-sparse polynomial $f(\mathbf{x}) \in \mathcal{R}[\mathbf{x}]$ of degree at most $|\mathcal{R}|$
can be encoded with about $r(d + 1)\log|\mathcal{R}|$ bits. To do this, we have to identify
at most $r$ monomials for which $f$ has a nonzero coefficient. The encoding of each
monomial $x_1^{e_1} \cdots x_d^{e_d}$ requires about $d \log|\mathcal{R}|$ bits, and for each such monomial
about $\log|\mathcal{R}|$ bits are then required to encode the coefficient.

Let us set

$$S = \sum_{j=1}^{d} s_j, \qquad T = \sum_{j=1}^{d} t_j, \qquad R = \sum_{j=1}^{d} t_j s_j, \qquad Q = (d + 1)\log|\mathcal{R}|.$$

Then after simple calculations we derive that (using the naive repeated squaring
exponentiation)

o generation of the public key: to produce the vector $\mathbf{a}$ requires $O(d \log|\mathcal{R}|)$
random bits; to construct the polynomials $h_j(\mathbf{x})$ requires the generation of
another $(T - d)Q$ random bits; the computation of the $h_j(\mathbf{a})$, $j = 1, \ldots, d$,
takes $O(Td \log(2|\mathcal{R}|))$ arithmetic operations in $\mathcal{R}$;

o the private key size is about $d \log|\mathcal{R}| + (T - d)Q$ bits;

o the public key size is about $TQ$ bits;

o cost of encryption: to construct the polynomials $g_j(\mathbf{x})$ requires the generation
of about $d \log|\mathcal{R}| + (S - d)Q$ random bits; the computation of the polynomial
$F(\mathbf{x})$ requires about $R$ arithmetic operations in $\mathcal{R}$ plus $Rd$ additions in
$\mathbb{Z}/N\mathbb{Z}$;

o the size of the encrypted message is about $RQ$ bits;

o the cost of decryption: the evaluation of $F(\mathbf{a}) = m$ takes about $O(Rd \log(2N))$
arithmetic operations in $\mathcal{R}$.
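The size estimates in this list are simple arithmetic in $S$, $T$, $R$ and $Q$. The sketch below evaluates them for one parameter set; the function name, the base-2 logarithm, and the sample parameters (a ring of size $2^{64}$ with $d = 3$ and all $s_j = t_j = 4$) are assumptions made for illustration only.

```python
import math

def enroot_sizes(ring_size, s, t):
    """Approximate sizes in bits from the estimates above; s and t list s_j and t_j."""
    d = len(s)
    log_r = math.log2(ring_size)                 # log |R|, taken to base 2
    S, T = sum(s), sum(t)
    R = sum(sj * tj for sj, tj in zip(s, t))
    Q = (d + 1) * log_r
    return {
        "private_key_bits": d * log_r + (T - d) * Q,
        "public_key_bits": T * Q,
        "ciphertext_bits": R * Q,
        "encryption_random_bits": d * log_r + (S - d) * Q,
    }

sizes = enroot_sizes(ring_size=2**64, s=[4, 4, 4], t=[4, 4, 4])
```

For these sample values a single 64-bit message element expands to a ciphertext of about 12288 bits, which illustrates the message expansion cost of the scheme.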

In the $\mathbb{F}_q$-case, the above scheme can be accelerated if Alice sets $e_1 = 1$,
selects a random element $a \in \mathcal{R}^*$ and $d - 1$ random exponents $e_2, \ldots, e_d \in
\mathbb{Z}/(q - 1)\mathbb{Z}$, and defines $\mathbf{a}$ as $(a^{e_1}, \ldots, a^{e_d}) \in (\mathcal{R}^*)^d$.

Again we mention that the performance of the ENROOT algorithm can 
be improved if one uses more sophisticated algorithms to evaluate powers and 
sparse polynomials; see [3,4,5,13,15]. 

Another possible way to improve performance is to use at Step 4 only k < d 
randomly selected polynomials from the set {fi, . . . , fd}. For the same level of 
security, there will be a trade-off between the complexity of Step 2 (hence the 
size of the private key) and the complexity of Step 4. 



4.4 Security Considerations

The obvious way to attack the ENROOT cryptosystem is to try to find a 
simultaneous solution to the system of polynomial equations 

$$f_j(\mathbf{x}) = 0, \qquad j = 1, \ldots, d, \tag{2}$$

which amounts to solving the hard problem in Section 4.1. All known algorithms 
to solve systems of polynomial equations of total degree n require (regardless 
of sparsity) an amount of time that is polynomial in n (that is, exponential 
time with respect to the bit length of n); see [7,14]. Since the degrees of the 
polynomials in (2) will be very large in practical implementations (about the 
size of N), this attack is totally infeasible. 

Another possibility is to simply “guess” a solution. One expects that a system
of $r$-sparse polynomial equations in $d$ variables of high degree over $\mathcal{R}$ will have
very few zeros if $d \ge 2$, even though the number of zeros of a polynomial over an
arbitrary ring is not necessarily bounded by the degree. Working heuristically,
if we view the vector of polynomials $\mathbf{f}(\mathbf{x}) = (f_1(\mathbf{x}), \ldots, f_d(\mathbf{x}))$ as defining a
“random” map $\mathbf{f} : \mathcal{R}^d \to \mathcal{R}^d$, then the expected number of roots common to all
of the polynomials $f_j(\mathbf{x})$ (that is, the cardinality of the kernel of $\mathbf{f}$) is given by

$$1 - (1 - |\mathcal{R}|^{-d})^{|\mathcal{R}|^{d}} \approx 1 - e^{-1},$$

hence this brute force attack will take roughly 0.245177.1^^ trials “on average” to 
succeed. For arbitrary rings, we expect the choice d> 2 will provide the 2®° level 
of security against this attack if N is at least 50 bits long. 
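The heuristic quantity $1 - (1 - |\mathcal{R}|^{-d})^{|\mathcal{R}|^d}$ can be checked numerically. The sketch below is only an illustration; the function name and the sample ring sizes are assumptions. It uses `log1p` and `expm1` so that the computation stays accurate when $|\mathcal{R}|^{-d}$ is tiny.

```python
import math

def probability_of_common_root(ring_size, d):
    """1 - (1 - |R|^-d)^(|R|^d) for a heuristic random map R^d -> R^d."""
    m = ring_size ** d
    # (1 - 1/m)^m = exp(m * log(1 - 1/m)); -expm1(x) equals 1 - exp(x)
    return -math.expm1(m * math.log1p(-1.0 / m))

limit = 1 - math.exp(-1)
approx = [probability_of_common_root(size, 2) for size in (10, 100, 2**25)]
```

Already for small rings the value is close to $1 - e^{-1} \approx 0.632$, and it converges to that limit as the ring grows.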

Although it is tempting to choose $d = \ell = 1$, in this case there are more
sophisticated attacks that provide some information about Alice's private key.
One of these is based upon consideration of the difference set of the powers of
monomials occurring in the polynomial $F(x)$. Indeed, if

$$f(x) = \sum_{i=1}^{t} A_i x^{n_i} \quad \text{and} \quad g(x) = \sum_{j=1}^{s} B_j x^{m_j}$$

are the polynomials selected by Alice and Bob, respectively, with $n_1 = m_1 = 0$
(and such that the sets $\{n_1, \ldots, n_t\}$ and $\{m_1, \ldots, m_s\}$ have a reasonably large
intersection), then $F(x)$ contains at most $\tau \le st$ monomials $C_\nu x^{r_\nu}$, where

$$r_\nu \equiv n_i + m_j \pmod{N}, \qquad \nu = 1, \ldots, \tau,$$

for some $i = 1, \ldots, t$ and $j = 1, \ldots, s$.

Consequently, finding the repeated elements in the difference set

$$\Delta = \{ r_\nu - r_\eta \pmod{N} : \nu, \eta = 1, \ldots, \tau \}$$

may reveal some information about the polynomial $f(x)$.
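The difference-set observation can be made concrete with a few lines of code. This is a sketch with hypothetical exponent lists and a toy modulus N = 1000; it only shows how repeated differences among the exponents of the product expose differences among the exponents of the two factors.

```python
from collections import Counter

def product_exponents(n, m, N):
    """Exponents r = n_i + m_j (mod N) that can occur in f(x)*g(x)."""
    return sorted({(ni + mj) % N for ni in n for mj in m})

def repeated_differences(exponents, N):
    """Count how often each nonzero difference r_v - r_w (mod N) occurs; keep repeats."""
    counts = Counter((u - w) % N for u in exponents for w in exponents if u != w)
    return {delta: c for delta, c in counts.items() if c > 1}

N = 1000
n = [0, 17, 450]        # hypothetical exponents of Alice's f, with n_1 = 0
m = [0, 17, 903]        # hypothetical exponents of Bob's g, with m_1 = 0
r = product_exponents(n, m, N)
repeats = repeated_differences(r, N)
```

Here the difference 17 occurs four times and 450 three times among the exponents of the product, and both are differences of exponents of $f$, so an observer of $F(x)$ learns candidate exponents of the secret polynomial.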

In addition, if $d = 1$, one can also compute the greatest common divisor
of $f(x)$ with $x^{N+1} - x$. Since this polynomial will have lower degree than $f$ in
general, it would be easier to find a root from a theoretical standpoint. Although
it is not clear how to do this in an amount of time that is polynomial in the
sparsity rather than in the degree of $f(x)$, it remains a potential threat.

On the other hand, these attacks on ENROOT fail when $d \ge \ell \ge 2$. Indeed,
the first attack may help to gain some information about the total set
of monomials in all of the polynomials $f_1, \ldots, f_d$, but it does not provide any
information about the individual polynomials since it is impossible to determine
which monomial comes from which product $f_j g_j$, $j = 1, \ldots, d$. In order to try
all possible partitions into $d$ groups of $s_j t_j$ monomials, $j = 1, \ldots, d$, one needs to
examine

$$V = \frac{R!}{(s_1 t_1)! \cdots (s_d t_d)!}$$

total combinations, where $R = s_1 t_1 + \ldots + s_d t_d$ as in Section 4.3. In particular, the most interesting case is when $s_1, \ldots, s_d$
are of approximately the same size, as well as $t_1, \ldots, t_d$, that is, when $s_j \sim s$ and
$t_j \sim t$ for $j = 1, \ldots, d$. In this case

$$\log V \sim st(d \log d),$$

hence the number $V$ of combinations to consider grows exponentially with respect
to all parameters, provided that $d \ge \ell \ge 2$.
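The count of partitions is a multinomial coefficient, so it can be computed exactly and compared with the leading term $st(d \log d)$. The sketch below uses assumed toy parameters ($d = 4$, $s = t = 3$) and a hypothetical function name.

```python
import math

def partitions_to_try(s, t):
    """V = R! / ((s_1 t_1)! ... (s_d t_d)!), with R the sum of the s_j * t_j."""
    sizes = [sj * tj for sj, tj in zip(s, t)]
    v = math.factorial(sum(sizes))
    for size in sizes:
        v //= math.factorial(size)
    return v

d, s, t = 4, 3, 3
V = partitions_to_try([s] * d, [t] * d)
estimate = s * t * d * math.log(d)      # leading term st(d log d), natural logarithm
```

For these values $\log V$ is roughly 0.9 of the estimate; the gap is the lower-order term that the asymptotic formula ignores.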

The second attack fails as well since the notion of greatest common divisor 
for multivariate polynomials is not defined, and taking resolvents to reduce to 
one variable is too costly. 

However, the case $d = 2$ is still not secure. Indeed, in this case we have
either $d = \ell = 2$ or $\ell = 1$. In either case there are very simple linear algebra
attacks. These attacks do not apply when $d \ge \ell > 2$, a case which we believe to be completely
secure against the aforementioned attacks. There are some other alternative
ways to guarantee that there are sufficiently many common elements in the sets
of exponents of monomials of $f_1, \ldots, f_d$. In particular, the first few monomials
of each polynomial $h_1, \ldots, h_d$ can be selected to have the same exponents.

Finally, we remark that the ENROOT cryptosystem is probably secure
against lattice attacks for the same reason that the SPIFI scheme is secure
(see Section 3.4): most lattice attacks would necessarily be based on lattices of
dimension equal to the cardinality $|\mathcal{R}|$ of the ring, and in practical implementations this
number would be very large.

4.5 Remarks 

The ENROOT cryptosystem is well-suited to private key sharing among multiple
parties. For parameter choices and approximate runtimes in the $\mathbb{F}_q$-case, we
refer the reader to [6]. The main inherent weakness of this cryptosystem is its
high message expansion cost. It is likely that working over certain noncommutative
rings or rings that are not principal ideal domains may improve the overall
security, so that smaller rings could be employed. Working over these rings, it
would be interesting to have a more thorough security analysis than we have
attempted here.



Abstract. The scheme proposed by Domingo in Electronics Letters was
the first anonymous fingerprinting scheme in which the help of a registration center
is not required in order to identify redistributors. However, the registration protocol
of that scheme is 4-pass, and its identification process requires many
exponentiations. In this paper, we propose a more efficient protocol than
the scheme by Domingo: it requires 2 passes in the registration protocol and needs
only 3+1 exponentiations in the identification process. For the electronic
commerce of digital contents, the registration protocol is therefore more efficient than that of the
previous scheme. Moreover, the computational complexity is
reduced, since the identification process requires only 3+1 exponentiations,
whereas the previous one requires 3+N/2 exponentiations. We thus
present an efficient anonymous fingerprinting scheme for electronic information with
improved automatic identification of redistributors.



1. Introduction 

Fingerprinting schemes do not prevent illegal copying; however, they deter
people from illegally redistributing digital contents by enabling the original merchant
of the contents to identify the original buyer of a redistributed copy. The general methods
of copyright protection proposed previously use encryption and access
control techniques. However, those techniques cannot prevent illegal copying once
permission to access the digital contents has been acquired. A more recently proposed method of
copyright protection is copyright marking. Copyright marking is a
technique in which ownership information is embedded into digital contents to deter
illegal copying. Copyright marking techniques generally use digital
watermarking and digital fingerprinting. Digital watermarking
embeds the same authentication code into every copy of the same contents, whereas
digital fingerprinting differs from it in that a different authentication code is
embedded into each copy of the same contents, so that the different authentication codes can identify
the redistributor of an illegal copy.

A subclass of digital fingerprinting is traitor tracing, which
identifies redistributors from the keys they use. Traitor tracing techniques detect
the redistributor of the keys used for image decryption in broadcasting systems such as
pay-TV.

Before the emergence of computers, only physical fingerprinting had been studied
and developed. With the increasing importance of digital contents, there is a strong
desire to use digital fingerprinting to protect intellectual property, because it requires
only light-weight cryptographic capability but still serves the purpose. Examples of digital
contents to be fingerprinted include documents, images, movies, sounds and so on.
Fingerprinting has a problem with collusion. Suppose digital contents are
distributed with different fingerprints. If the members of a collusion group that obtained those contents
compare their copies, they can easily discover all the fingerprints. The collusion
group can therefore remove these fingerprints, interpolate the gaps, and resell the digital contents
without worrying about being traced.
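The collusion problem can be illustrated with a deliberately naive marking scheme. In the sketch below, which is a hypothetical example and not a scheme from the literature cited here, each buyer's copy differs from the original in one character; comparing copies position by position exposes exactly the marked positions.

```python
def differing_positions(copies):
    """Positions where at least two copies disagree; colluders can spot these."""
    return [i for i, symbols in enumerate(zip(*copies)) if len(set(symbols)) > 1]

# Hypothetical marking: each buyer's copy changes one character to upper case.
copy_a = "the Quick brown fox"
copy_b = "the quick Brown fox"
copy_c = "the quick brown Fox"

exposed_by_one = differing_positions([copy_a])
exposed_by_two = differing_positions([copy_a, copy_b])
exposed_by_three = differing_positions([copy_a, copy_b, copy_c])
```

A single buyer learns nothing from a lone copy, while every additional colluder reveals further marked positions, which is why collusion-tolerant codes are needed.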

This problem of collusion was first discussed by Blakley et al. [2], and a solution
against larger collusions was presented by Boneh and Shaw [3]. Low and Maxemchuk
[4] presented a collusion analysis in their model for general multiparty cryptographic
protocols. Explicit collusion-tolerant constructions were given in [5,6].

Traitor tracing is the equivalent of fingerprinting for cryptographic keys. It was
introduced by Chor et al. [7] for broadcast encryption. For example, when digital
movies are broadcast in encrypted form and only the decryption keys are sold, a
different key is sold to each pay-TV subscriber. Furthermore, the encryption scheme
is adapted so that all keys can be used to decrypt the same cipher-contents. Since
the decryption key is different for each subscriber, the pay-TV company can trace the
redistributor who made illegal copies of his key. Naor et al. [8] introduced a threshold
traitor tracing scheme. Boneh et al. [9] and Fiat et al. [10] introduced efficient
public key and dynamic traitor tracing schemes.

Recently, several studies have enhanced the functionality of fingerprinting schemes in
various ways. Asymmetric fingerprinting was introduced by Pfitzmann and Schunter
[11]. Unlike conventional fingerprinting schemes, only the buyer of a fingerprinted
object knows the contents with the fingerprint. When a merchant finds an illegal
copy, he can nevertheless identify the buyer and prove to third parties that this buyer
bought the copy from him. Pfitzmann [12] also proposed a traitor tracing scheme
using asymmetric fingerprinting. Anonymous fingerprinting was introduced by
Pfitzmann and Waidner [13] as an analogue of the blind signature for fingerprinting. It
uses a trusted third party, called the registration center, to identify buyers suspected of
behaving illegally. Thus the merchant is not able to identify a buyer without the help of the
registration center. Coin-based anonymous fingerprinting was introduced by
Pfitzmann and Sadeghi [14]. Automatic identification of the redistributor when fingerprinted
contents are illegally redistributed, without the help of the registration center, was
introduced by Domingo [1]. Figure 1 presents the classification of copyright
marking systems [19].

The remainder of the paper is organized as follows. Section 2
describes the classification of fingerprinting techniques. Section 3 describes the
requirements of a digital fingerprinting system in contrast to those of a digital watermarking
system. Section 4 gives an overview of the Domingo scheme [1]. Section 5 presents an efficient
anonymous fingerprinting scheme for electronic information with improved automatic
identification of redistributors. Section 6 discusses the security of the proposed system.
The conclusion is given in Section 7.




Figure 1. Classification of Copyright Marking System



2. Classification of Digital Fingerprinting

Fingerprinting schemes are technical means to discourage people from illegally
redistributing the digital contents they have legally purchased. Fingerprinting schemes
can be classified by five criteria, as follows: the objects to be fingerprinted,
detection sensitivity, fingerprinting methods, generated fingerprints and disclosure of
identification information to be fingerprinted. These five categories are not mutually exclusive
[18].



2.1 Object-based classification 

The nature of the objects is a basic criterion, since it may provide a customized way to
fingerprint the object. Object-based classification is divided into two categories. One
is physical fingerprinting (e.g. human fingerprints, iris patterns and so on). The other
is digital fingerprinting. If an object to be fingerprinted is in a digital format so that
a computer can process its fingerprints, we call it digital fingerprinting.



2.2 Detection sensitivity based classification

The sensitivity level of a fingerprinting scheme against illegal use is another criterion.
Based on the detection sensitivity against violation, we classify fingerprinting into
three categories. One is perfect fingerprinting, in which any alteration to the object
that makes the fingerprint unrecognizable also makes the object unusable.
Another is statistical fingerprinting, in which, given sufficiently many misused objects to examine, the
fingerprint generators can gain any desired
degree of confidence that they have correctly identified the compromised user. The
third is threshold fingerprinting, a hybrid of the
above two types. It allows a certain level of illegal use, but it identifies the
illegal copy when the threshold is reached. That is, making fewer copies of an object
than the threshold is tolerated, and such copies are not detected at all.



2.3 Fingerprinting method-based classification 

Basic methods for fingerprinting, such as recognition, deletion, addition and
modification, have also been used as another classification criterion. Recognition type
fingerprinting consists of recognizing and recording fingerprints that are already
part of the object. In deletion type fingerprinting, some legitimate portion of the
original object is deleted. If some new portion is added to the object, it is addition
type fingerprinting. These additional parts can be sensible or meaningless. If a change
to some portion of the object is made, it is called modification type fingerprinting.



2.4 Fingerprint-based classification 

There are two categories in fingerprint-based classification. One is discrete
fingerprinting, in which a generated fingerprint takes one of a finite number of discrete
values. Many digital fingerprints fall into this category. The other is
continuous fingerprinting, in which a generated fingerprint has a continuous value
and essentially there is no limit to the number of possible values. Most physical
fingerprints are of this type.



2.5 Identification information-based classification 

There are two categories in identification information-based classification. One is
symmetric fingerprinting, in which the fingerprinted digital content is known to both buyer
and merchant. This is comparable to symmetric cryptography, in which the encryption key is known to both
sender and receiver. The other is asymmetric fingerprinting, in which the fingerprinted digital
content is known only to the buyer. This is comparable to asymmetric
cryptography, in which the secret key is known only to the receiver.



3. Requirements of fingerprinting systems

The requirements of a fingerprinting system are similar to those of a digital
watermarking system, but there are a few differences. That is, while digital
watermarking authenticates ownership of digital contents, digital fingerprinting
authenticates ownership of the digital contents and also provides an identification
function [18].

Requirement 1. Collusion tolerance

Attackers should not be able to find, generate, or delete the fingerprint by comparing
copies, even if they have access to a certain number of copies. In particular, the
fingerprints must have a common intersection.

Requirement 2. Object quality tolerance

This is similar to the perceptibility requirement of digital watermarking. The fingerprints must not
significantly decrease the usefulness or quality of the digital contents.

Requirement 3. Object manipulation tolerance

If an attacker tampers with the digital contents, the fingerprint should still be recoverable,
unless there is so much noise that the digital contents become useless. In particular, the
fingerprint should tolerate lossy data compression. This makes it possible to trace an illegal
distributor even after the digital contents have been manipulated.



4. Overview of previous scheme

Domingo [1] described a construction for anonymous fingerprinting in which, on
finding a fingerprinted copy, the merchant needs no help to identify the dishonest
buyer. The role of the registration center is limited to buyer registration. In addition,
the redistribution fraud can be proven to third parties. In the schemes proposed before
Domingo's, on finding a fingerprinted copy, the merchant needs the help of a
registration center to identify the redistributor.



4.1 System set up 

Let $p$ be a large prime such that $q = (p - 1)/2$ is also prime. Let $G$ be a group of
order $p - 1$, and let $g$ be a generator of $G$ such that computing discrete logarithms to
the base $g$ is difficult. Assume that both the buyer $B$ and the registration center $R$
have ElGamal-like public-key pairs [15]. The buyer’s secret key is $x_B$ and his public
key is $y_B = g^{x_B} \bmod p$. The registration center $R$ uses its secret key to issue certificates
that can be verified using $R$’s public key. All public keys are assumed to be known
and certified.



4.2 Buyer’s registration protocol

(i) The registration center $R$ chooses a random secret nonce $x_r$ and sends
$y_r = g^{x_r} \bmod p$ to buyer $B$.

(ii) Buyer $B$ chooses secret random values $x_1$ and $x_2$ such that $x_1 + x_2 = x_B$ and
sends $S_1 = y_r^{x_1} \bmod p$ and $S_2 = y_r^{x_2} \bmod p$ to registration center $R$. The buyer $B$
proves to registration center $R$ in zero-knowledge that he possesses $x_1$ and $x_2$. The
proof given in [16] for showing possession of discrete logarithms may be used here.
The buyer $B$ computes an ElGamal public key $y_1 = g^{x_1} \bmod p$ and sends it to
registration center $R$. In the fingerprinting protocol, $S_1$ will act as a pseudonym.

(iii) The registration center $R$ checks that $S_1 S_2 = y_B^{x_r} \bmod p$ and $y_1^{x_r} = S_1 \bmod p$. The
registration center $R$ returns to $B$ the certificates $Cert(y_1 \| S_1)$, $Cert(y_1 \| S_2)$ and the
value $x_r$. The certificates are linked by $y_1$ and state the correctness of $S_1$ and $S_2$.
$Cert(x)$ is generated using the registration center’s secret key $x_R$.

By going through the above registration procedure several times, a buyer can obtain
several different pseudonym pairs $(S_1, S_2)$.
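The algebra behind the two checks in step (iii) can be verified on a toy instance. The sketch below is an illustration only: the subscripted names follow the reading $x_B$, $x_r$, $x_1$, $x_2$ of the garbled source, the parameters p = 23 and g = 5 are far too small for real use, exponents are reduced modulo p - 1 as an assumption, and the zero-knowledge proof and the certificates are omitted.

```python
import random

p, g = 23, 5                      # toy safe prime (q = 11) and a generator of Z_p^*
order = p - 1                     # exponents are reduced modulo p - 1 in this sketch

rng = random.Random(1)
x_B = rng.randrange(1, order)     # buyer's long-term secret key
y_B = pow(g, x_B, p)              # buyer's public key

# (i) R picks a nonce x_r and sends y_r
x_r = rng.randrange(1, order)
y_r = pow(g, x_r, p)

# (ii) B splits x_B additively and sends S_1, S_2 and y_1
x_1 = rng.randrange(1, order)
x_2 = (x_B - x_1) % order
S_1, S_2 = pow(y_r, x_1, p), pow(y_r, x_2, p)
y_1 = pow(g, x_1, p)

# (iii) R's two checks
check_sum = (S_1 * S_2) % p == pow(y_B, x_r, p)
check_link = pow(y_1, x_r, p) == S_1
```

Both checks hold because $S_1 S_2 = g^{x_r(x_1 + x_2)} = y_B^{x_r}$ and $y_1^{x_r} = g^{x_1 x_r} = S_1$.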

Figure 2. Buyer registration protocol



4.3 Fingerprinting protocol

(i) The buyer $B$ sends $y_1$, $y_r$, $[S_1, Cert(y_1 \| S_1)]$ and text to merchant $M$, where text is
a string identifying the purchase. The buyer $B$ computes an ElGamal signature sig
on text with the secret key $x_1$. sig is not sent to the merchant $M$.

(ii) The merchant $M$ verifies the certificate on $S_1$.

(iii) The buyer $B$ and merchant $M$ enter a secure two-party computation [17]. The
merchant $M$’s inputs are $y_1$, $y_r$, text and item, where item is the original information
to be fingerprinted. The buyer $B$’s inputs are $x_r$, sig, $S_2$ and $Cert(y_1 \| S_2)$. The
computations performed are:

(a) $ver_1$ = Verify(text, sig, $y_1$). The signature sig on text is verified using the public key
$y_1$. The output $ver_1$ is a boolean variable only seen by the merchant $M$ which is true
if and only if the signature verification succeeds.

(b) $ver_2$ = Verify($S_2$, $Cert(y_1 \| S_2)$, $x_r$, $y_r$, $y_1$). First, the certificate on $S_2$ is verified.
Secondly, it is checked that $g^{x_r} = y_r \bmod p$ and $y_1^{x_r} = S_1$. Thirdly, it is verified that the value
of $y_1$ in the certificate on $S_2$ is the same as that in the certificate on $S_1$ previously verified by the merchant $M$. The output $ver_2$
is a boolean variable only seen by the merchant $M$ which is true if and only if the three
aforementioned checks succeed.

(c) item* = Fing(item, emb). A classical fingerprinting algorithm is used to embed emb
into the original information item, where

emb = text || sig || $y_1$ || $x_r$ || $y_r$ || $S_2$ || $Cert(y_1 \| S_2)$

The fingerprinted information item* is obtained as output and is only seen by the
buyer $B$. In the above two-party computation, the merchant $M$ obtains its outputs first,
and unless $ver_1$ and $ver_2$ are both true, the buyer $B$ does not obtain the output
item*.

Figure 3. Buyer authentication process of fingerprinting protocol

Figure 4. Secure two-party computation of fingerprinting protocol



4.4 Identification process 

On finding a redistributed copy, the merchant $M$ only needs to extract emb. The extracted
information contains the values specified by emb = text || sig || $y_1$ || $x_r$ || $y_r$ || $S_2$ || $Cert(y_1 \| S_2)$
and is combined by $M$ with the purchase record $[S_1, Cert(y_1 \| S_1)]$ to construct a
redistribution proof.

(i) The signature sig on the text is verified using the anonymous public key $y_1$.

(ii) The value $y_1$ links the certificates on $S_1$ and $S_2$. Moreover, $y_1$ cannot be altered
since it is part of the certificates.

(iii) The value $x_r$ proves that the owner of the key $y_1$ is the same as the owner of $S_1$.
This is so because, according to the registration protocol, the registration center $R$
only reveals $\log_g y_r = x_r$ to $B$ after $B$ has provided $y_1$ such that $y_1^{x_r} = S_1 \bmod p$. Now
if the Diffie-Hellman key exchange is secure, the buyer $B$ cannot produce a correct
$y_1$ without knowing $\log_{y_r} S_1 = x_1$.

(iv) Finally, to identify a buyer, the merchant $M$ raises the public keys of buyers to
$x_r$ until a public key $y_B$ is found such that $S_1 S_2 = y_B^{x_r} \bmod p$. The dishonest buyer $B$
has then been identified. Note that, since $y_1$ is certified, $x_r$ cannot be forged by the
merchant $M$ to unjustly accuse a buyer.
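Step (iv) is a linear search through the public key directory, with one exponentiation per candidate. The sketch below illustrates it under assumptions: the toy parameters p = 23 and g = 5, the directory entries, the secret values, and the function name `identify_buyer` are all hypothetical, and the subscripted names follow the same reading of the source as in Section 4.2.

```python
p, g = 23, 5                                  # toy parameters; g generates Z_p^*

def identify_buyer(public_keys, S_1, S_2, x_r):
    """Return the name whose public key y satisfies S_1*S_2 = y^x_r (mod p)."""
    target = (S_1 * S_2) % p
    for name, y in public_keys.items():
        if pow(y, x_r, p) == target:          # one exponentiation per candidate
            return name
    return None

# Hypothetical directory of buyers' public keys y = g^x mod p
directory = {"alice": pow(g, 4, p), "bob": pow(g, 9, p), "carol": pow(g, 15, p)}

# Values recovered from emb and the purchase record for a copy bought by "bob"
x_r = 3
y_r = pow(g, x_r, p)
x_1, x_2 = 2, 7                               # x_1 + x_2 = 9, bob's secret key
S_1, S_2 = pow(y_r, x_1, p), pow(y_r, x_2, p)

culprit = identify_buyer(directory, S_1, S_2, x_r)
```

With a directory of N buyers the search needs about N/2 exponentiations on average, which is the cost that the proposed scheme aims to reduce.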



5. Proposed schemes

In Domingo’s scheme, the registration protocol is 4-pass: the buyer obtains the random nonce
$x_r$ from the registration center in order to blind the buyer’s identity. This is not efficient for
electronic commerce, since the number of communication passes is increased. In our proposed
scheme, the registration protocol is 2-pass. Moreover, the identification process of Domingo’s scheme needs many
exponentiations, since the merchant $M$ raises the public keys in the public key
directory of buyers to $x_r$, which requires a large amount of computation. In our proposed
scheme, only 3+1 exponentiations are needed, so identification is
improved over his scheme: the many exponentiations are replaced by 3+1
exponentiations. We now present the efficient and improved scheme.

Our scheme has three sub-protocols and one process: the registration protocol,
the fingerprinting protocol, the identification process, and the dispute protocol. The system set up is
the same as in the Domingo scheme. Our scheme is constructed as follows.



5.1 Buyer’s registration protocol 

(i) The buyer $B$ chooses secret random values $x_1$ and $x_2$ such that $x_1 \cdot x_2 = x_B$ and
sends $t = g^{x_1} \bmod p$, together with $x_2$ encrypted under the registration center’s public key $pk_R$,
that is, $E_{pk_R}(x_2)$, to registration center $R$. Here $t$ is the anonymous public key of buyer $B$. The
buyer $B$ proves to registration center $R$ in zero-knowledge that he possesses $x_1$. The
proof given in [16] for showing possession of discrete logarithms may be used here.
The buyer $B$’s public key is an ElGamal public key $y_B$.

(ii) The registration center R first decrypts E_{pk_R}(x_2) using its secret key sk_R
and checks that t^{x_2} = y_B mod p. If verification is successful, then the
registration center R chooses a random secret nonce x_r ∈ Z_p and computes
T = g^{x_1 x_r} mod p. T will act as a pseudonym of the buyer B. The registration
center R returns to B the certificates Cert(t||x_2) and Cert(T) and the value x_r;
the certificates state the correctness of t and T.
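The two registration passes can be sketched as follows. This is only an illustration under stated assumptions: it takes the relations to be x_1 · x_2 = x_B, t = g^{x_1}, t^{x_2} = y_B and T = t^{x_r}, works in a toy subgroup of prime order q (so exponents are reduced mod q), and leaves out the encryption of x_2, the zero-knowledge proof and the certificates. All function names are hypothetical.

```python
import secrets

# Toy group (assumption, far too small for real use):
# p = 607 is prime, q = 101 is a prime dividing p - 1, g = 64 has order q mod p.
P, Q, G = 607, 101, 64

def buyer_register(x_b):
    """Pass 1 (buyer): split x_B as x_1 * x_2 mod q and form t = g^x_1 mod p."""
    x1 = secrets.randbelow(Q - 1) + 1      # random non-zero value
    x2 = x_b * pow(x1, -1, Q) % Q          # chosen so that x1 * x2 = x_B (mod q)
    t = pow(G, x1, P)                      # anonymous public key
    return x1, x2, t

def center_register(t, x2, y_b):
    """Pass 2 (center): check t^x_2 = y_B mod p, then derive the pseudonym T."""
    if pow(t, x2, P) != y_b:
        raise ValueError("t and x_2 do not match the buyer's public key")
    x_r = secrets.randbelow(Q - 1) + 1     # random secret nonce
    T = pow(t, x_r, P)                     # equals g^(x_1 * x_r) mod p
    return x_r, T

x_b = 37                                   # buyer's long-term private key (example)
y_b = pow(G, x_b, P)                       # buyer's ElGamal public key
x1, x2, t = buyer_register(x_b)
x_r, T = center_register(t, x2, y_b)
```

Repeating both calls yields a fresh pair (t, T) each time, which is how several pseudonym pairs would be obtained.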

By going through the above registration procedure several times, a buyer can obtain
several different pseudonym pairs (t, T).

Figure 5. Proposed registration protocol



5.2 Fingerprinting protocol

(i) The buyer B sends t, [T, Cert(T)] and text to M, where text is a string identifying
the purchase. The buyer B computes an ElGamal signature sig on text with the
random secret key x_1. The signature sig is not sent to the merchant M.

(ii) The merchant M verifies the certificate on T . 

(iii) The buyer B and the merchant M perform a secure two-party computation [17].
The merchant M’s inputs are T, t, text and item, where item is the original
information to be fingerprinted. The buyer B’s inputs are x_r, sig, x_2 and
Cert(t||x_2). The computations performed are:

(a) ver_1 = Verify(text, sig, t). The signature sig on text is verified using the
anonymous public key t. The output ver_1 is a boolean variable only seen by the
merchant M, which is true if and only if the signature verification succeeds.

(b) ver_2 = Verify(t, Cert(t||x_2), x_2, x_r, T). First, the certificate on t is
verified. Secondly, it is checked that T = g^{x_1 x_r} mod p. The output ver_2 is a
boolean variable only seen by the merchant M, which is true if and only if the two
aforementioned checks succeed.

(c) item* = Fing(item, emb). A classical fingerprinting algorithm is used to embed
emb into the original information item, where

emb = text || sig || t || x_r || x_2 || T || Cert(t||x_2)

The fingerprinted information item* is obtained as output and is only seen by the
buyer B. In the above two-party computation, although the merchant M obtains the
outputs ver_1 and ver_2 first, to check that they are both true, it does not obtain
the output item*.



5.3 Identification process 

On finding a redistributed copy, the merchant M extracts emb. The extracted
information contains the values specified by
emb = text || sig || t || x_r || x_2 || T || Cert(t||x_2)
and is combined by the merchant M with the purchase record [T, Cert(T)] to
construct a redistribution proof.

Figure 6. Buyer authentication process of proposed fingerprinting protocol

Figure 7. Proposed secure two-party computation of fingerprinting protocol



(i) The signature sig on the text is verified using anonymous public key t . 

(ii) The value links the certificates on T and t . Moreover x^ cannot be altered, 
since it is part of the certificates. 

(iii) The value x_r proves that the owner of the key t is the same as the owner of T.
This is so because, according to the registration protocol, the registration center R
only reveals x_r to B after B has provided T such that T = g^{x_1 x_r} mod p. Now if the
Diffie-Hellman key exchange is secure, B cannot produce a correct T.

(iv) Finally, to identify a buyer, the merchant M raises t to x_2 and obtains
id = t^{x_2} mod p. The merchant M then searches the public key directory for a
public key equal to id. The dishonest buyer B has been identified. Note that, since T
and t are certified, x_2 cannot be forged by M to unjustly accuse a buyer.
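Step (iv) amounts to one exponentiation followed by a directory lookup. The sketch below assumes the relation id = t^{x_2} mod p, a toy subgroup of prime order q, and a hypothetical directory indexed by public key value; the names and keys are invented for the example.

```python
# Toy group (assumption): p = 607 is prime, g = 64 has prime order q = 101 mod p.
P, Q, G = 607, 101, 64

def identify(t, x2, directory):
    """Return the owner of the public key id = t^x_2 mod p, or None if absent."""
    ident = pow(t, x2, P)          # the single exponentiation of step (iv)
    return directory.get(ident)    # lookup instead of testing every public key

# Hypothetical public key directory: public key value -> buyer name.
private_keys = {"alice": 37, "bob": 58, "carol": 90}
directory = {pow(G, x, P): name for name, x in private_keys.items()}

# Values as they would be recovered from emb for a copy bought by "bob".
x1 = 5
x2 = private_keys["bob"] * pow(x1, -1, Q) % Q   # x1 * x2 = x_B (mod q)
t = pow(G, x1, P)

print(identify(t, x2, directory))
```

The cost does not depend on the number N of keys in the directory, whereas raising every candidate public key to a secret exponent takes about N/2 exponentiations on average.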



5.4 Disputation protocol 

This protocol is optional. It is performed if the merchant M shows the
redistribution proof to a third party. The merchant M sends the proof (the
purchase record [T, Cert(T)] and emb) to a judge. The judge verifies the proof.
First, the certificate [T, Cert(T)] is verified using the registration center R’s
public key. Finally, the judge checks that t^{x_2} mod p = y_B. If the above four
verifications are valid, then the owner of the public key is accused. If the
registration record is necessary to prove the guilt of the buyer, then the judge
requests the registration record from the registration center.



5.5 Comparison with previous scheme 

We now compare our scheme with the previous one. From the viewpoint of
computational cost, we contrast the three protocols above (registration,
fingerprinting, identification).

Table 1. Comparison with previous scheme

Protocol        Compared item             Previous scheme    Our scheme
--------------  ------------------------  -----------------  ---------------
Registration    Exponential operation     6                  7
                Multiplication operation  1                  2
                Pass number               4                  2
Fingerprinting  Exponential operation     5                  4
                Multiplication operation  5                  6
                Pass number               6                  Same
Identification  Exponential operation     3+N/2 (average)    3+1 (any case)
                Multiplication operation  2                  3
                Comparison operation      N/2                Same

N: number of public keys in the directory



In the registration protocol, the numbers of multiplication and exponential
operations in our scheme each increase by one, but the number of passes decreases
from 4 to 2. Because registration is 2-pass, our scheme is efficient for
electronic commerce in digital contents.

In the fingerprinting protocol, the number of multiplication operations in our
scheme increases by one, the number of passes is the same, and the number of
exponential operations decreases from 5 to 4.

In the identification process, the multiplication and comparison operations are
the same, but the number of exponential operations decreases from 3+N/2 to 3+1.
Our scheme thus improves the automatic identification of the redistributor.



6. Security of proposed scheme

We now show the security of our scheme. Security is not decreased relative to the
previous scheme, since our scheme rests on the same underlying problem as the
previous one.

Proposition 1 (Buyer’s registration security): The registration protocol provides
buyer authentication without compromising the private key of the buyer.

Proof: The registration center sees t, x_2 and the zero-knowledge proofs. The latter
leak no information. If we do not consider the zero-knowledge proofs, then the
registration center needs no knowledge of x_1 to find a value t' such that
t'^{x_2} mod p = y_B. If we consider the zero-knowledge proofs, then suppose an
impersonator not knowing x_B can compute t, x_2 such that t^{x_2} mod p = y_B. Then
the impersonator can compute the discrete logarithm x_B. Hence, if impersonation is
feasible, so is computing discrete logarithms. In general, the discrete logarithm
problem is hard, so the registration center cannot produce t' such that
t'^{x_2} mod p = y_B.

Proposition 2 (Buyer anonymity): An honest buyer who follows the fingerprinting
protocol will not be identified if computing discrete logarithms is hard and secure
two-party computation is feasible.

Proof: In the fingerprinting protocol, the merchant M knows t, [T, Cert(T)] and his
outputs of the secure two-party computation, ver_1 and ver_2. Finding y_B would
require knowledge of x_1. However, if secure two-party computation is feasible, the
only way for the merchant to find x_1 is to compute log_g T using Cert(T). As
illustrated in the security of the registration protocol, the discrete logarithm
x_1 · x_r would have to be computed such that T = g^{x_1 x_r} mod p. But no
polynomial-time algorithm for the discrete logarithm problem is known, so an
attacker cannot compute x_1 · x_r. Hence buyer anonymity is guaranteed.



7. Conclusion 

We presented an efficient asymmetric anonymous fingerprinting scheme with
improved identification, as a solution to the problem of redistribution of
electronic information. Our construction is based on the discrete logarithm
problem. The security of our scheme is the same as that of the previous scheme,
but efficiency is increased and the computational cost is decreased. With the
growth of electronic commerce, more efficient schemes are necessary. Our scheme
can be applied to defend against the piracy of software and multimedia contents.



Abstract. In this paper we investigate the notion of space efficient
public-key infrastructure (PKI) directories. The area of PKI is relatively
young and we do not know yet the long term implications of design
decisions regarding PKI and its interface with applications. Our goal is to
study mechanisms for networks and systems settings where the size of
directories is a significant resource (due to space restrictions).

Naturally, the tools we employ are cryptographic hashing techniques 
combined with the tradeoffs of public storage and computation. Our 
mechanisms are quite simple, easy to implement and thus practical, yet 
they are quite powerful in making the operation substantially less costly 
(mainly) storage-wise and in trading storage for computation. In the 
past, tree based mechanisms were considered extensively to improve the 
complexity of PKI directories. We show that hashing techniques provide 
various advantages as well. 



1 Introduction 

We are currently at the point, due to the enormous surge of Internet use, where
large-scale Public Key Infrastructures (PKIs) are being deployed. A primary
usage of PKIs, which we deal with herein, is for representing and authenticating
digital identities via the use of digital signatures. In this case the public keys
available are used for signature verification. This setting enables authenticated
channels (on top of which privacy can be established as well, via authenticated
key exchange). It also enables undisputed “evidence collection” in audits and
logs, since unforgeable digital signatures authenticate the message and its
originator in a non-repudiable fashion. PKIs are employed in the area of “network
security” to assure smooth, robust and secure operation of Intranets and
inter-organization communication [KPS95]. PKIs will also be used in the growing
area of distributed web-based electronic-commerce applications as the basic
enabling technology [FB97].

Digital signature schemes can be used directly between users who know in
advance each other’s public signature verification functions. However, for larger
scale interactions one needs a PKI setting. A very basic PKI configuration
consists of a Certificate Authority (CA) which publishes signature verification keys
in a public file [Koh78] (also called “trusted directory” [M98] and “certificate
directory” in [MOV97]). Users may query the file to find bindings of public keys
to their owners (pull model, see [MOV97] chapter 13.6.3). This basic setting is
equivalent to a “white pages directory” (it does not deal with revocations of
certificates and, in fact, the SET infrastructure for credit cards has adopted
this model since credit cards are revoked at a different level of management,
namely in the physical world [FL98]). A full-fledged PKI, in turn, also includes a
Repository (distributed directories) where Certificate Revocation Lists (CRLs)
include lists of revoked public keys, and users may query the CA and the
repository for status checks of keys (or may be notified of these status checks).
CRLs can also work (using Online/Real-time Certificate Status Checking Protocols
([OCSP-ietf]) as being defined by the IETF) with the push model ([MOV97]),
where each user possesses his/her certificate and presents it upon usage.

PKI is a new area and we lack the long term understanding of its requirements
regarding aspects like: space minimization issues, temporal (history-related)
issues, validity determination issues, or application and device integration issues.
We believe that we need to start thinking about such aspects as PKI becomes a
reality.

Here we deal with cases where storage required by signatures’ directories,
such as Certification Authorities (CAs) and Revocation servers (CRLs), caches
of signature values, time stamping servers, etc. needs to be minimized. Space
minimization is motivated by certain environments; in particular, it is
significant for application layer scenarios involving small devices (mobiles and
PDAs, set tops, network computers, etc.) and network and application layer security
mechanisms involving logging large audit information (in firewalls and gateways),
and caching large server data bases, as well as for small business directories.

In certain network contexts, digital signatures and certificates (which
themselves include digital signatures) may be stored in numerous places. A signature
of a message may be used for an intermediate node to have irrefutable evidence
of the fact that a certain packet passed through that node. This will create
authenticated logs which are basic components of network security. In fact,
firewalls may store such logs. Another scenario where signatures are stored is when
signatures and certificate directories are cached by end-nodes for e-commerce
applications. In scenarios like the above, we realize that it is possible to have
an extensive duplication of storage when the operations become complex and
involve many parties (say, in a multicast scenario or a multi-party protocol). We
were therefore motivated to reduce the storage spent in the network on
“signatures and certificates” while not losing their functional power
(non-repudiable evidence collection). We believe that our work can provide further
options to designers of PKI and of architectures which exploit digital signatures
(e.g., IETF, X.509 etc.) when considering operation under certain (storage)
constraints.

A public key certificate can be approximately 256 bytes in length (by an
underestimating conservative calculation that assumes a DSA signature of 320
bits and a public key of 1024 bits as well as 88 bytes for alphanumeric
information: dates and identities, not including the user ID). A system of 100
million users (e.g., the subscribers of the US postal service) would therefore
require the CA to have at least 25.6 gigabytes of storage for all of the
certificates. In this paper we describe a method that allows the CA to store merely
10-15 bytes of information for each user (excluding the user’s name and information
related to the file access method: hash or inverted list, etc.), instead of 256
bytes. This gives a 16 to 25-fold decrease in the storage requirements of the CA
directory, reducing the size of the database to 1-1.5 gigabytes. (We note that the
certificate size is conservative and assumed to be optimized; a typical X.509
certificate can contain much more information).
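The storage figures above can be checked with a few lines of arithmetic. The sketch below only restates the numbers given in the text (320-bit signature, 1024-bit key, 88 bytes of text, 100 million users, 1 gigabyte taken as 10^9 bytes); the variable and function names are chosen for the example.

```python
DSA_SIGNATURE_BITS = 320
PUBLIC_KEY_BITS = 1024
TEXT_BYTES = 88              # dates and identities
USERS = 100_000_000

# 40 + 128 + 88 bytes per certificate
cert_bytes = DSA_SIGNATURE_BITS // 8 + PUBLIC_KEY_BITS // 8 + TEXT_BYTES

def directory_gigabytes(bytes_per_user, users=USERS):
    """Directory size in gigabytes (10^9 bytes) for a fixed per-user record."""
    return bytes_per_user * users / 10**9

full_size = directory_gigabytes(cert_bytes)
digest_sizes = [directory_gigabytes(b) for b in (10, 15)]
savings = [cert_bytes / b for b in (10, 15)]

print(cert_bytes, full_size, digest_sizes, savings)
```

With a 15-byte record the ratio is 256/15, about 17, and with a 10-byte record it is 25.6.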

What we present is the following: 

1. We first identify cases where one does not need to store the signature itself to
fulfill its function; rather, a “signature digest” suffices. This simple
demonstration is merely our starting point, stimulating the possibility of reduced
storage and tradeoffs.

2. Sometimes, storage saving in directories is an outcome of using small size
signature values. We therefore deal with further compressing signature size of
existing methods. We show how to simply modify the DSA signature scheme
to enable smaller sized signatures (we start with DSA which is already a small
size signature; note that we do not consider multi-variate polynomial based
schemes). This modified scheme preserves the algebraic properties of DSA,
and at the same time yields signatures |q| + K bits long, where K
is the base-2 logarithm of the maximum value in the range of a one-way
hash function H used in computing the signature. The size of the signature
is made comparable to Schnorr signatures while being based on the freely
available DSA. This reduction is the starting point of our techniques and is
based on hashing.

3. We then show how, using hashing, any signature scheme can be represented
in a public file in a much shorter record. This demonstrates the tradeoff of
verification time and signature size vs. the public storage size.

4. Then we describe how in certain PKI contexts, the techniques can save on
storage and communication between users and CAs/directories. Some of the
saving implies saving of computation time as well.

5. Using accumulated hashing, we show how to minimize the task of verifying
membership of certificates in lists (e.g. CRLs).

Our work represents a few simple steps, perhaps some of which have been
noticed earlier, yet it is effective and practical; it can be readily employed in
various settings which are different from the standard X.509 one (we believe that
non-standard solutions will be needed and applicable in many crucial cases).



Related Work 

The notion of a digital signature was put forth in [DH76] and the first realization
of it was the RSA scheme [RSA78], which is based on the problem of factoring.
Since then many digital signature schemes have been proposed with various
properties. The first public signature scheme making use of the discrete log
problem was ElGamal [ElG85]. Schnorr then proposed a signature scheme based
on the problem of computing discrete logs in a subgroup of order q, where
q | p − 1, with q and p prime. With |q| = 140 bits, a Schnorr signature is about 212
bits long, which is much smaller than an RSA signature. The Digital Signature
Algorithm due to Kravitz [DSS91] is a variant of ElGamal and Schnorr. Like
Schnorr, DSA makes use of the problem of computing discrete logs in a subgroup.
A DSA signature is 320 bits in length.

Many issues regarding PKI and trust have been dealt with in the last few
years. Notions of management of public key directories, making them current,
and optimizing the sizes of proofs to users of a certificate status are dealt with
in [FL98,M98,K98,R98,M96,NN98,ALO98]. We are not aware of attempts to
optimize storage of directories in models where we need to save on it. The issue
of compressing and optimizing the usage of signatures in networking scenarios
was dealt with in the case of bulk data (flows of messages and multicasts), e.g. in
[WL98]. Our setting is different (we deal with an individual signature as a block)
and it specializes to the PKI setting, though we show how to employ their method
as well in combination with our techniques.

2 Minimizing Signature Size 

We skip the definition of a signature scheme in this version; a definition is given
e.g. in [MOV97] Chapter 11. A signature scheme involves a signing function which
is private and a corresponding public verification function associated with the
user owning the signing function. The verification is efficient while forging a
signature on unsigned messages is hard and considered impossible for all practical
purposes. Given a message m the signing algorithm generates σ(m), which is the
signature. If m is recoverable from the signature, the scheme is called a “message
recovery scheme”. The typical signature methods are RSA [RSA78] (based on
the hardness of factoring) and DSA [DSS91] (based on the hardness of extracting
discrete logarithms).



2.1 A Very Simple Starting Point: Signature Digest 

In network applications, often the signature is kept for the sake of having a
“record.” This is typical in a scenario where an intermediary (a third party which
is neither the sender nor the receiver(s)) is performing an “audit function”. In
many cases, the signature and the message are kept in storage. This may be
unnecessary. In many cases, when a dispute occurs over the authenticity of
messages, either the receiver or the sender(s) present the message and the signature.
In this case the intermediary need only store a digest of the signature and the
message, or just of the signature in case it is a “message recovery signature.”
Given a signature σ(m) and a message m, the stored value can be H(σ(m), m),
where H is a collision-intractable hash function (a hash where collision
finding is hard). In practice, the SHA-1 message digest function [SHA1] suffices (or
SHA-2). When presented with the signature and the message the third party can
recompute the digest and compare it with his/her storage. (The storage
is assumed to be organized so that access based on the hash value is efficient;
such storage schemes are common).
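A minimal sketch of such an audit store is given below. It assumes the pair (σ(m), m) is encoded by length-prefixing the signature before hashing, which is one possible unambiguous encoding and not something fixed by the text; the class and function names are hypothetical. SHA-1 is used because it is the function named above, and `hashlib.sha256` could be substituted without other changes.

```python
import hashlib

def signature_digest(signature: bytes, message: bytes) -> bytes:
    """Compute H(sigma(m), m) with a length prefix separating the two parts."""
    h = hashlib.sha1()
    h.update(len(signature).to_bytes(4, "big"))  # marks where the signature ends
    h.update(signature)
    h.update(message)
    return h.digest()

class AuditLog:
    """Intermediary that keeps only 20-byte digests, indexed by hash value."""

    def __init__(self):
        self._digests = set()

    def record(self, signature: bytes, message: bytes) -> None:
        self._digests.add(signature_digest(signature, message))

    def was_recorded(self, signature: bytes, message: bytes) -> bool:
        # In a dispute, a party presents (signature, message); we recompute and look up.
        return signature_digest(signature, message) in self._digests

log = AuditLog()
log.record(b"example-signature-bytes", b"packet payload")
print(log.was_recorded(b"example-signature-bytes", b"packet payload"))
```

The set gives hash-based access, matching the assumption that lookup by digest is efficient.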

2.2 On Smaller Signatures: Size Efficient DSA Variant 

The above signature storage technique is a standard one, yet one may want to
“digest” the signature at its creation so that the digest is sufficient for
validation of the signature by the verification algorithm. Such digest methods were
originated by Schnorr, and used in a variant in the DSA signature suggested
by NIST for general use. One advantage of the former is that it may be shorter,
while the latter is free to be used by the public. Here we combine both advantages
into a third variant which we present in this subsection. This is a slightly less
simple starting point for our needs.

First, let us review the Digital Signature Algorithm. Let p be a large prime,
such that q | p − 1, where q is a 160 bit prime. Let g be an element with order
q mod p. The private signing key of a user is x, where x ∈ Z_q. The public
verification key of the user is y = g^x mod p. To sign a message m, the user
computes the following:

1. k ∈_R Z_q

2. r = (g^k mod p) mod q

3. s = k^{-1}(H(m) + xr) mod q

4. output the signature (r, s)

The function H used here is SHA. To verify the signature’s authenticity, the
verifier checks that r = (g^{H(m) s^{-1}} y^{r s^{-1}} mod p) mod q. Note that
|q| = 160 bits, hence the signature is 320 bits long. The question we ask is,
“why is g^k mod p reduced modulo q?”.
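The signing and verification equations can be exercised with a toy instance. The parameters below (p = 607, q = 101, g = 64) are assumptions picked only so that q divides p − 1 and g has order q; they offer no security. The helper `recover_gk` is a hypothetical name for the verification expression before the final reduction mod q, and `sign` returns the nonce k solely so that the demonstration can compare values.

```python
import hashlib
import secrets

# Toy parameters (assumption, far too small for real use):
# p = 607 is prime, q = 101 is a prime dividing p - 1, g = 64 has order q mod p.
P, Q, G = 607, 101, 64

def H(m: bytes) -> int:
    """SHA-1 of the message as an integer (reduced mod q where it is used)."""
    return int.from_bytes(hashlib.sha1(m).digest(), "big")

def sign(m: bytes, x: int):
    """Return ((r, s), k); k is exposed only for the demonstration below."""
    while True:
        k = secrets.randbelow(Q - 1) + 1           # k in [1, q-1]
        r = pow(G, k, P) % Q                       # r = (g^k mod p) mod q
        s = pow(k, -1, Q) * (H(m) + x * r) % Q     # s = k^-1 (H(m) + x r) mod q
        if r and s:                                # retry on the rare zero values
            return (r, s), k

def recover_gk(m: bytes, sig, y: int) -> int:
    """g^(H(m) s^-1) * y^(r s^-1) mod p, which equals g^k mod p for valid signatures."""
    r, s = sig
    w = pow(s, -1, Q)
    return pow(G, H(m) * w % Q, P) * pow(y, r * w % Q, P) % P

def verify(m: bytes, sig, y: int) -> bool:
    r, s = sig
    if not (0 < r < Q and 0 < s < Q):
        return False
    return recover_gk(m, sig, y) % Q == r

x = 45                       # private key (example value)
y = pow(G, x, P)             # public key y = g^x mod p
sig, k = sign(b"hello", x)
print(verify(b"hello", sig, y), recover_gk(b"hello", sig, y) == pow(G, k, P))
```

The second printed value illustrates that any verifier already computes g^k mod p in full before reducing it modulo q.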

To answer this, let us consider DSA and its origins. After all, DSA was
designed by the NSA, and many questioned whether or not it has a back-door.
It may seem that by reducing g^k mod p by q we conceal g^k mod p. But this is not
the case. It was shown in e.g. [NR94,YY97] that DSA can and in fact does give
away g^k (mod p), and can be (ab)used for key exchange and encryption. This
attack exploits the fact that g^k mod p is in fact readily computable from (r, s).
To see this, note that g^{H(m) s^{-1}} y^{r s^{-1}} mod p is g^k mod p for valid
signatures.

Thus, reducing mod q does not hide g^k mod p at all. What then is the purpose
of reducing modulo q to derive r? To help answer this, let us review the Schnorr
digital signature scheme [Sc89], which is another small space signature and which
is covered by a patent [Sc89-p].

Let s be the private signing key chosen randomly mod q. Let v = g^{-s} mod p
be the corresponding public key. Here p, q, and g are the same as in DSA. To
sign m the signer does the following:

1. k ∈_R Z_q

2. t = g^k mod p

3. e = H(m, t)

4. u = k + se mod q

5. output the signature (e, u)

To verify the signature, the verifier computes z = g^u v^e mod p and checks
that e = H(m, z). Note that in Schnorr, g^k mod p is hashed. Indeed this is also
possible in DSA. So, the following is our modified DSA signing algorithm:

1. k ∈_R Z_q

2. r = H(g^k mod p)

3. s = k^{-1}(H(m) + xr) mod q

4. output the signature (r, s)

To verify the signature, the verifier checks that
r = H(g^{H(m) s^{-1}} y^{r s^{-1}} mod p).
So, the only change made to DSA is that we hash g^k mod p rather than reduce
it modulo q. Hence, by making DSA more like Schnorr, we end up with a faster,
more space efficient signature scheme. It is faster since H can be SHA, MD5,
etc., and hashing is faster than modular reduction. It is more efficient since the
range of H can be narrower than |Z_q| for any 160 bit prime q nowadays, and
in the future if we use larger p’s and q’s (say 2048 bit p’s and 320 bit q’s), the
range of H may become an even larger improvement compared with the size of q.
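The modified algorithm differs from DSA in one line, as the sketch below tries to show. It reuses toy parameters (an assumption, insecure by design) and takes the hash applied to g^k mod p to be the first 10 bytes of SHA-1, so K = 80; the byte encoding of g^k mod p and the function names are choices made for this example. With a 160-bit q, such a signature would occupy 160 + 80 = 240 bits instead of 320.

```python
import hashlib
import secrets

# Toy parameters (assumption): p = 607 prime, q = 101 prime with q | p - 1, g of order q.
P, Q, G = 607, 101, 64
K_BYTES = 10                                   # K = 80 bits for the r component

def H(m: bytes) -> int:
    return int.from_bytes(hashlib.sha1(m).digest(), "big")

def H_short(value: int) -> int:
    """K-bit hash of a group element, replacing the reduction modulo q."""
    data = value.to_bytes((P.bit_length() + 7) // 8, "big")
    return int.from_bytes(hashlib.sha1(data).digest()[:K_BYTES], "big")

def sign(m: bytes, x: int):
    while True:
        k = secrets.randbelow(Q - 1) + 1
        r = H_short(pow(G, k, P))              # r = H(g^k mod p): the only change
        s = pow(k, -1, Q) * (H(m) + x * r) % Q # s = k^-1 (H(m) + x r) mod q
        if s:
            return r, s

def verify(m: bytes, sig, y: int) -> bool:
    r, s = sig
    if not 0 < s < Q:
        return False
    w = pow(s, -1, Q)
    v = pow(G, H(m) * w % Q, P) * pow(y, r * w % Q, P) % P   # equals g^k mod p
    return H_short(v) == r

x = 45
y = pow(G, x, P)
r, s = sign(b"hello", x)
print(verify(b"hello", (r, s), y), r.bit_length() <= 8 * K_BYTES)
```

Here r may exceed q; this is harmless because r only appears in exponents, where it is implicitly reduced modulo q.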

Speculating Remark: Why then was DSA not defined this way? It does not
seem that it was designed this way for the purposes of implementing a
back-door in DSA. It was perhaps designed this way to look less like Schnorr to
avoid patent conflicts. However, this is pure speculation.

2.3 Size Efficient Generic Public Verification Key 

The above can be viewed as a method which hashes the random key used within 
an individual signature operation. We now show that such hashing can be applied 
to the permanent verification key as well. 

Assume now the setting where a user publishes his public key (in his home
page or in a white page directory or in a PGP system [Zim92]). Let a signature
scheme in use have a private signing key s_U for user U, and its corresponding
public verification key v_U. In a typical public key system v_U gets published. We
modify the system to publish h_U = H(v_U), which is made U’s public key. Here
H is a one-way (collision intractable) hash function with, for instance, a range
of {0, 1}^80; in practice the first 10 (or 15) bytes of SHA can be used (public keys
are fixed size strings which are very structured (we may add an error correction
code field to the signature value to add redundancy), and collisions will still be
hard to find even when using a 10-15 byte hash function size).

To sign, the user signs using s_U and also sends an appendix v_U. The verifier,
in turn, first verifies that the hash of the appendix matches the publicly available
h_U, and then uses the appendix to verify the signature itself.

The above simple hashing idea demonstrates a tradeoff between public key
size and the size and time of signature verification (but the time does not grow
by more than twice). The fact that the hash function is collision intractable
assures the unique binding of v_U and the publicly available h_U.^1 Of course, a
tradeoff between space and security (collision probability) is immediate.
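The collision probability can be estimated with a union bound. The worked figures below take, as an assumption following the paper’s own footnote estimate, 2^28 honestly generated certificates and a hash that behaves like a random function with 80-bit (10-byte) or 120-bit (15-byte) output.

```latex
\[
  \binom{2^{28}}{2} \approx \frac{(2^{28})^{2}}{2} = 2^{55}
  \quad \text{pairs of certificates,}
\]
\[
  \Pr[\text{some pair collides}] \;\le\; \frac{2^{55}}{2^{80}} = 2^{-25} \approx 3 \times 10^{-8}
  \quad \text{(10-byte digest),}
\]
\[
  \Pr[\text{some pair collides}] \;\le\; \frac{2^{55}}{2^{120}} = 2^{-65} \approx 2.7 \times 10^{-20}
  \quad \text{(15-byte digest).}
\]
```

This bound covers accidental collisions only; a party who deliberately searches for colliding keys faces a different, and for short digests much lower, work factor.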

We note that the above method is a hybrid which combines the “pull model” 
(where the verifier has access to the signature scheme) with the “push model” 
(where the signature scheme is presented to the verifier). The appendix may 
include the certification chain, which may be checked up to the required point 
by the receiver. 



3 Space Efficient PKI Directories 

Next we exploit the techniques above and directory organization methods to 
reduce space and processing time in the context of PKI. The methods described 
here are an alternative to organize PKI when storage is the limiting constraint. 



3.1 Efficient White-Pages Based Signature PKI

In this subsection we will explain how to implement a size efficient public file 
which is maintained by a CA (CA directory), in a basic digital signature public 
key system (a white pages system). Note that a white page design is not what we 
suggest here, but it is a construction known in the literature (where certificates 
are backed up or maintained centrally). 

In a typical public key system, the CA signs public keys and issues the 
resulting digital certificates to users; in each signature verification a CA signature 
needs to be checked as well. In our model, we have a white pages (trusted file) and 
we can save space of certificates. In addition we will save on signature verification 
time. 

In this subsection, let C_U denote the (lean) digital certificate of user U, which
is a usual certificate but without the CA signature on the user’s public key.
Recall that the usual certificate information typically includes: user (entity) ID,
validity period, serial number, verification-key description, issuer ID, information
about the entity, information about the key, information about the usage. The
CA typically includes this information and its own signature when making a
certificate. In our scenario, the CA makes h_U = H(C_U) available in the public
file.

To sign a message, U signs the message using his private key as in a
normal public key system. However, C_U (the lean certificate) is included with the
message and the signature. The ‘new signature’ then, is the signature from the
relevant digital signature scheme plus the lean digital certificate of the user.

^1 Assuming 2^28 non-maliciously chosen certificates for all US citizens, there are
2^55 pairs, while 2^80 (or 2^120) random hash values are possible (assuming SHA is a
good ideal random hash, as done in many analyses), which makes the probability of a
“collision pair” very small.

A verifier verifies a signature by first hashing the lean certificate and
comparing the result to h_U from the white pages file. The signature itself is then
verified by taking U’s public key from C_U and performing signature verification.
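The two verification steps can be sketched as follows. The certificate fields, the JSON serialization used before hashing, and the `verify_signature` callback (which stands in for whatever signature scheme is deployed) are all assumptions made for the illustration; only the 10-byte truncated digest and the order of the checks come from the text.

```python
import hashlib
import json

DIGEST_BYTES = 10   # size of h_U stored in the white pages file

def lean_digest(cert: dict) -> bytes:
    """h_U = H(C_U): truncated SHA-1 over a canonical encoding of the lean certificate."""
    encoded = json.dumps(cert, sort_keys=True).encode("utf-8")
    return hashlib.sha1(encoded).digest()[:DIGEST_BYTES]

def verify_with_white_pages(white_pages, cert, message, signature, verify_signature):
    """Step 1: check H(C_U) against the trusted file. Step 2: verify with the key in C_U."""
    if white_pages.get(cert["user_id"]) != lean_digest(cert):
        return False
    return verify_signature(cert["public_key"], message, signature)

# Hypothetical lean certificate (no CA signature inside).
cert_u = {"user_id": "U", "public_key": "04a1b2c3", "valid_until": "2001-12-31"}
white_pages = {"U": lean_digest(cert_u)}          # all the CA directory stores for U

# Stand-in for a real signature check, used only to make the sketch executable.
def toy_verify(public_key, message, signature):
    return signature == b"sig-by-" + public_key.encode()

ok = verify_with_white_pages(white_pages, cert_u, b"msg", b"sig-by-04a1b2c3", toy_verify)
print(ok)
```

A certificate whose public key has been swapped fails at step 1, before any signature computation is attempted.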

This system provides an enormous storage savings for the CA. In a typical
public key system, the CA stores C_U for each user, which is approximately 256
bytes of information (under a very conservative estimate). By having the CA
store H(C_U), the CA only stores 10 bytes per user, giving the CA a 25-fold
storage savings. Also, we take advantage of the trusted white pages file to save
time in verifying the certificate (essentially cutting the verification time by half).

Note that the white pages file itself can be signed (and verified by a user)
once by means of a signature on it by a CA when the file is published (loaded
by a user). In this case, the CA’s key is used minimally, using the white pages
model which is relatively static.

In fact, various blocks can be signed separately, to allow monotonic increase 
of the directory, and speeding up the verification of the integrity of the directory 
(which can be done block by block whenever the block is in use). Aggregated 
“groups of pages” can be signed to balance the checking of validity and size. 
In fact a method like the one by Wong and Lam [WL98] can be exploited here
for signing bulk data efficiently. (We will explain this method, which combines
message authentication and signature to achieve more efficient signatures, in more
detail in the next version).

We also note that the hash values can be used for file organization and access
methods as well. Further, another organization method allows the file server to
keep the certificate value off-line (to be retrieved if needed) while the on-line
organization is as above. The off-line “full-fledged certificate” can be used rarely
depending on the application, in which case it can be retrieved from secondary
storage (server) rather than using the available data (or cached data).



3.2 Size Efficient CRL Repositories in PKI’s 

Now we show the functionality of our technique in the context of a Public Key
Infrastructure where the CA signs certificates and distributes them to users (what
is known as the push model originally adopted by X.509).

In this setting it is a “repositories’ job” to maintain a list of revoked public 
keys for each user of a Public Key Infrastructure (CRTs). In the event that a 
signing private key is lost or stolen, the corresponding public key and/or digital 
certificate is stored by the repository for that user. From then on, when user A 
verifies a signature of user B, user A has the option of checking the repository 
to see if the version of user B’s public key that A has, has been revoked. If the 
public key had been revoked, after the query, user A will be informed of this, 
and the authenticity of the message could then be brought into question. This 
capability is important because a malicious user could use the lost and or stolen 
private key and try to sign a message, impersonating user B. The attacker may 
be aware that user A is storing the corresponding outdated public key, and may 
hope that user A doesn’t query the repository. 

A repository typically stores either the public key or the digital certificate of 
each lost or stolen key. However, hashing can be used to greatly reduce the 
repository’s storage requirements. By having the repository store $v_u = H(C_u)$ 
instead of the certificate $C_u$, the repository allocates merely, say, 10 bytes for 
each lost or stolen public key. A user checks whether a certificate $C_u$ has been 
revoked by checking whether $v_u = H(C_u)$ is in a CRL. This saves space and 
communication (queries can be based on the hash value in the “pull model”, and list 
distribution in the “push model” can also be based on hash values, which can 
help in accessing certificates). These methods are also applicable to PKIs that 
are used for encryption and decryption (rather than signing and verification). 
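
As an illustration of the hashed repository described above, the following minimal sketch stores only a truncated digest for each revoked certificate. The choice of SHA-256, the truncation to 10 bytes, and all class, function, and certificate names are assumptions made for this example; the text does not fix a hash function or an encoding.

```python
import hashlib

DIGEST_BYTES = 10  # the text's "say, 10 bytes"; the exact length is an assumption


def cert_digest(cert: bytes) -> bytes:
    """Return v_u = H(C_u), here SHA-256 truncated to DIGEST_BYTES (assumed hash)."""
    return hashlib.sha256(cert).digest()[:DIGEST_BYTES]


class HashedCRL:
    """Hypothetical repository that keeps only digests of revoked certificates."""

    def __init__(self):
        self._revoked = set()

    def revoke(self, cert: bytes) -> None:
        # Store the short digest instead of the full certificate.
        self._revoked.add(cert_digest(cert))

    def is_revoked_digest(self, digest: bytes) -> bool:
        # "Pull model" query: the client sends only the short digest.
        return digest in self._revoked


crl = HashedCRL()
cert_b_old = b"hypothetical encoded certificate of user B (old key)"
cert_b_new = b"hypothetical encoded certificate of user B (new key)"
crl.revoke(cert_b_old)
```

User A would compute `cert_digest` over the copy of B's certificate that A holds and send only those 10 bytes to the repository. Shorter digests save more space but raise the chance that two different certificates share a digest, so the length is a tradeoff.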

3.3 Proving Revocation/ Active Certificate 

Here the tradeoff is between the server’s work and the time and resources 
the user spends on checking. 

If a user needs a “proof of revocation” (that is, does not trust the directory), one can 
hash the certificate hash values using an accumulated hash (as given in [BdM94, 
Ny96]), and the CA signs this value. Let the list be $\{L_1, L_2, \ldots, L_m\}$ and let 
$H(L)$ denote its accumulated hash. 

The CA signs $H(L)$. To prove revocation, the CA 
presents to the user three values: the hash of the values up to $L_i$, $L_i$ itself, and the 
hash of the values larger than $L_i$. From these three values the user can reconstruct (for each 
$i$) the value $H(L)$ and can check the signature (if this was not done before). The 
list is harder for the CA to maintain (keeping hashes of sublists will help). As 
the list changes, $H(L)$ changes periodically; alternatively, a list of lists is maintained. 

This method implies short proofs to users of the fact that a key is a member 
of a list (CRL). The user’s task is now independent of the size of the CRL (which 
may be attractive for low-energy devices). 
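
To make the constant-size membership proof concrete, here is a simplified sketch in the spirit of modular-exponentiation accumulators of the kind cited as [BdM94]. It is not the three-value construction described above: it uses a single witness per entry, a toy modulus, and a naive mapping from entries to exponents, all of which are assumptions chosen so that the example runs. A real deployment would need a large modulus of unknown factorization and a careful encoding of entries.

```python
import hashlib

# Toy modulus built from two Mersenne primes so the sketch runs quickly.
# A real accumulator needs a large modulus whose factorization nobody knows.
N = (2**61 - 1) * (2**89 - 1)
BASE = 3  # public starting value (assumption)


def to_exponent(entry: bytes) -> int:
    """Map a CRL entry (a certificate hash) to an odd integer exponent."""
    return int.from_bytes(hashlib.sha256(entry).digest(), "big") | 1


def accumulate(entries, start=BASE):
    """Fold entries into one value; the result does not depend on their order."""
    acc = start
    for entry in entries:
        acc = pow(acc, to_exponent(entry), N)
    return acc


def make_witness(entries, member):
    """CA side: accumulate every entry except `member`."""
    return accumulate(e for e in entries if e != member)


def verify(member, witness, signed_total):
    """User side: one modular exponentiation, whatever the size of the CRL."""
    return pow(witness, to_exponent(member), N) == signed_total


crl = [b"hash-of-cert-1", b"hash-of-cert-2", b"hash-of-cert-3"]
total = accumulate(crl)        # plays the role of H(L), which the CA would sign
w = make_witness(crl, crl[1])  # proof that crl[1] is on the list
```

The user checks `verify(crl[1], w, total)` together with the CA's signature on `total`. The cost on the user's side is one exponentiation and does not grow with the list, while the CA pays for recomputing witnesses when the list changes.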

Further trust may be implied by having distributed directories (in which 
case storing hashed values is important since there is no need to replicate longer 
values when a shorter digest can be employed and is sufficient). 

In [GGM00], it was suggested to have an active certificate list, similar to a 
revocation list, where the CA signs the root, and a CA can send the path to the 
root in the CRL. This can also be replaced by an accumulated hash as above. 

4 Conclusions 

Scenarios for minimizing the signature storage and representation have been 
investigated. These were motivated by the needs of certain systems and commu- 
nication settings. In some settings directory size issues are crucial and standards 
may not provide a feasible solution. 

We discussed various space minimization issues and tradeoffs between space 
and other resources in various potential PKI components. 

A modification of DSA was presented that allows for variable-sized DSA-like 
signatures. Not only does this scheme provide smaller signatures with 
nearly the same level of security (for use with the directories), but the signing 
algorithm is also faster. We showed how hashing can be used to decrease the 
storage requirements of public verification keys by making the signature larger. 
This is used to implement a trusted public key file in a PKI, which also decreases 
the verification time. The same method was then applied to reducing the 
storage/communication requirements of a directory or CRL repository in various 
PKI settings. 

We believe that determining the right components of a PKI is an open issue 
that will evolve as we gain experience with PKI and its applications. The 
interaction between cryptographic issues and traditional database issues in the 
case of PKI directories presents potentially interesting open issues. 

Abstract. The Security Evaluation System is a system that evaluates the security of 
an entire enterprise network domain, which consists of various components, and that 
supports a security manager or a security management system in making decisions 
about security management of the enterprise network based on the evaluation. 
It helps the security manager or the security management system 
decide how to change the configuration of the network to prevent 
attacks that exploit the security vulnerabilities of the network. The Security Evaluation 
System checks the “current status” of the network, predicts possible intrusions, 
and supports decision-making about security management to prevent such 
intrusions in advance. In this paper we analyze the requirements of a Security 
Evaluation System that automates the security evaluation of an enterprise network 
consisting of various components and that supports decision-making about 
security management to prevent intrusion, and we propose a design 
that satisfies the requirements. 



1 Introduction 

1.1 Overview of the Security Evaluation System 

The recent explosive increase in network use makes the security of a network 
that consists of various components much more important than that of individual 
hosts. In addition to the Firewall and the Intrusion Detection System [11], we need a 
tool that can check the current security status of the enterprise network. To maintain 
the security of the enterprise network, the tool must be capable of evaluating 
the security of the entire enterprise network and of supporting decisions in 
network security management to prevent intrusions caused by the security 
vulnerabilities of the network. 

Security evaluation requires various kinds of tests and analyses, and it is hard for a 
network security manager to perform all the tests and analyses of a large enterprise 
network alone. Therefore, a tool is needed that automates the tests and analyses. 
The tool must be capable of evaluating the security of the entire enterprise 
network domain as well as analyzing the security of an individual host 
or subnetwork. It must support a security manager or a security management system in 
making decisions about security management of the network based on the security 
evaluation result, to prevent intrusions caused by the vulnerabilities of the network. 

The Security Evaluation System is a system that uses automated analysis tools to 
evaluate the security of the entire enterprise network domain, which consists of 
various components, and that supports decision-making about security management based on the 
evaluation to prevent intrusions in advance. It helps security managers or security 
management systems decide how to change the configuration of the 
network to prevent attacks that exploit the security vulnerabilities of the network. The 
Intrusion Detection System [11] detects an attack when the intrusion actually happens, 
whereas the Security Evaluation System checks the “current security status” of the network, 
predicts possible intrusions, and helps security managers prevent them in 
advance. 



1.2 Necessity of the Research on the Security Evaluation System 

So far there is no formal, standardized technology for the Security Evaluation System, 
and vulnerability assessment tools developed by individual software companies are 
used for security evaluation of the network instead. 

Hacking tools such as sscan, SATAN [8], SAINT, and ISS are widely used for 
vulnerability assessment, but the individual tools are not integrated, and it is hard to 
produce an integrated security evaluation report from the vulnerabilities found by each tool. The tools 
have to be updated whenever a new vulnerability is found and a new version of the 
tool that reflects the vulnerability becomes available. They just provide a report on several 
well-known vulnerabilities found by network scanning. 

Commercial vulnerability assessment tools such as AXENT NetRecon [3], ISS 
Internet Scanner, System Scanner, Database Scanner [4] and RSA Kane Security 
Analyst [5] are used for security evaluation, but these tools provide only some host or 
network scanning functionality. 

The Security Evaluation System has to evolve to provide automatic updates of its 
checks for new vulnerabilities, vulnerability analysis using hacking 
simulation, and analysis of the possible intrusions due to a specific vulnerability. It also 
has to provide integration with the Security Policy Server, vulnerability recovery 
functionality integrated with the Security Management System, decision support for 
security management, and scalability for large enterprise networks. So far, however, 
research on the Security Evaluation System has been insufficient, and there is no tool that 
supports all the functions described above. Therefore, further research and 
development are needed to advance Security Evaluation System technology. 

In this paper we analyze the existing tools for security evaluation. Then we analyze 
the requirements of the Security Evaluation System that automates the security 
evaluation of the enterprise network and that supports decision-making about security 
management to prevent the intrusion, and we propose a design for it that satisfies the 
requirements. 

2 Vulnerability Assessment Tools 

Security evaluation requires various kinds of tests and analyses, and it is hard for a 
network security manager to perform all the tests and analyses of a large enterprise 
network alone. Therefore, a vulnerability assessment tool is needed that 
automates the tests and analyses. Network-based scanners and host-based scanners [1] are 
the typical vulnerability assessment tools. 



2.1 Network-Based Scanner 

A network-based scanner performs quick, detailed analyses of an enterprise’s critical 
network and system infrastructure from the perspective of an external or internal in- 
truder trying to use the network to break into systems. Features of the network-based 
scanners are as follows [1]. 



■ A network-based scanner analyzes network-based devices on the network and 
quickly provides repair reports to allow quick corrective action. 

■ It can be set up and used quickly because it requires no host software to be in- 
stalled on the system being scanned. 

■ It assesses network-based vulnerabilities by replicating techniques that intruders 
use to exploit remote systems over the network. 

■ Many network-based vulnerabilities are more efficiently investigated over the 
network. 

■ It can test vulnerabilities of critical network devices that don’t support host- 
scanning software, including routers, switches and printers. 



2.2 Host-Based Scanner 

Host-based scanning’s strengths lie in direct access to low-level details of a host’s 
operating system, specific services, and configuration details. While a network-based 
scanner emulates the perspective that a network-based intruder would have, a host- 
based scanner can view a system from the security perspective of a user who has a 
local account on the system [1, 2]. 

■ A host-based scanner can identify risky user activities. 

■ It detects signs that an intruder has already infiltrated a system. It also detects 
any unauthorized changes in critical system files. 

■ It detects signs that an intruder is still active on a system, including locating 
“sniffer” programs [8]. 

■ It detects well-known hacker back-door programs such as “Back Orifice” and 
local host services vulnerable to “local buffer overflow”. 

■ It is ideal for performing resource-intensive file system checks, which are im- 
practical with network-based scanners. 

2.3 Vulnerability Assessment Tools in Use 

AXENT NetRecon [3], ISS Internet Scanner, System Scanner, Database Scanner [4] 
and RSA Kane Security Analyst [5] are well-known vulnerability assessment tools in use. 
They provide network-based and host-based scanning functionality. 

They provide a graphical user interface, and some products can analyze 
communication devices such as routers. Some products suggest corrective actions when a 
vulnerability is found. Table 1 summarizes the major vulnerability assessment tools 
[6]. 



Table 1. Vulnerability Assessment Tools 

3 Design of the Security Evaluation System 

In this section, we analyze the requirements of a scalable Security Evaluation 
System that automates the security evaluation of an enterprise network consisting of 
various components, and that supports decision-making about security management of the 
network to prevent intrusions caused by the vulnerabilities of the network. Then we 
propose a design that satisfies the requirements. 

3.1 Requirements of the Security Evaluation System 

The requirements of the Security Evaluation System are as follows. 



■ It must support the security evaluation of an enterprise network consisting of 
various components, and it must be scalable. Its performance should not degrade 
as the size of the network grows. 

■ It must be able to analyze the vulnerability of a host. For example, it must be 
able to check for system configuration errors, Trojan horses, and system file integrity. 
It also must be able to trace signs of a hacker’s intrusion. 

■ It must perform the vulnerability checks and security analyses of each subnet. 
For example, it must be able to analyze the vulnerability of network services. 

■ It must provide the function of generating the security evaluation reports. It also 
must support a security manager or a security management system in making de- 
cisions about security management of the network based on the security evalua- 
tion result to prevent intrusion caused by the vulnerability of the network. 

■ It must analyze whether the various components of the enterprise network obey 
the security policy of the network domain or not using the Security Policy 
Server. 

■ It must support the remote and central administration. It must provide the central 
view of the security status of the entire network. 

■ It must provide the vulnerability recovery functionality using Security Manage- 
ment System. 

■ It must be able to get new vulnerability information from directory services fre- 
quently and apply it to the security evaluation. 



3.2 Architecture of the Security Evaluation System 

Fig. 1 illustrates the architecture of the Security Evaluation System that satisfies the 
requirements described above. The major components of the Security Evaluation Sys- 
tem are Agents, Subnet Analyzers, a Domain Analyzer, a Security Evaluation Rule 
Manager and a Manager Tool. 

3.2.1 Agent 

Agents run on each host and analyze the security of the host in detail. 
They access the low-level details of the host’s operating system, specific services, and 
configuration. Each Agent checks for system configuration errors, signs of a 
hacker’s intrusion, system file integrity, and the existence of Trojan horses or 
viruses. Each Agent then generates a security evaluation report of the host and sends it 
to the Subnet Analyzer that is in the same subnet. 

3.2.2 Subnet Analyzer 

Subnet Analyzers evaluate the security of each subnet from the viewpoint of a network 
user. Each Subnet Analyzer analyzes the vulnerabilities of the network services and 
daemons. It analyzes the host security evaluation reports from each Agent in the subnet 
and finds vulnerabilities that can be detected only with information from several 
hosts that are members of a distributed environment. A subnet security evaluation 
report is sent to the Domain Analyzer. 



[Figure: block diagram. Legible labels include Security Policy, Security Management System, Manager Tool, Domain Analyzer, certificate issuance to each server, security management decision, updates for new vulnerabilities, and store/search of the security evaluation rules.] 

Fig. 1. Architecture of the Security Evaluation System 



3.2.3 Domain Analyzer 

The Domain Analyzer evaluates the security of the entire enterprise network domain based 
on the security evaluation results from the Agents and Subnet Analyzers. It analyzes 
the reports from each analyzer and finds correlations between the vulnerabilities. It 
predicts the possible intrusions due to the vulnerabilities analyzed so far. Then it makes 
decisions about security management to prevent the possible intrusions and reports them 
to the security manager. It also sends a notification mail to the system manager of 
any host that has security weaknesses. It notifies the Security Management System or 
the Intrusion Detection System [11] of the decisions about security management 
and helps them prevent the possible intrusions. 

J. S. Lee, S. C. Kim, and S. W. Sohn 

3.2.4 Security Evaluation Rule Manager 

The Security Evaluation Rule Manager helps the security manager generate and manage 
the evaluation rules that are needed for security evaluation. Security evaluation rules 
consist of vulnerability information related to a specific host or network resource, 
the possible intrusion due to the vulnerability, and countermeasures for it. Security 
evaluation by Agents, Subnet Analyzers, and the Domain Analyzer is based on the 
security evaluation rules that are managed by the Rule Manager. The Rule Manager frequently gets new 
vulnerability information from the external trusted directory service and 
updates the security evaluation rules based on it. The Security Evaluation System 
can therefore cope with new vulnerabilities rapidly. 
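
The three-part structure of a security evaluation rule, and the way directory updates could feed the rule store, can be sketched as follows. The field names, the in-memory `RuleDB`, and the dictionary format of directory entries are all hypothetical; the paper does not specify a rule format or a directory protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvaluationRule:
    rule_id: str             # hypothetical identifier
    target: str              # host or network resource the rule applies to
    vulnerability: str       # vulnerability information
    possible_intrusion: str  # intrusion the vulnerability could enable
    countermeasure: str      # recommended countermeasure


class RuleDB:
    """In-memory stand-in for the Security Evaluation Rule DB."""

    def __init__(self):
        self._rules = {}

    def upsert(self, rule: EvaluationRule) -> None:
        # A rule with an existing id replaces the older version.
        self._rules[rule.rule_id] = rule

    def for_target(self, target: str):
        return [r for r in self._rules.values() if r.target == target]


def apply_directory_update(db: RuleDB, entries) -> None:
    """Turn vulnerability entries fetched from a directory service into rules."""
    for entry in entries:
        db.upsert(EvaluationRule(**entry))


db = RuleDB()
apply_directory_update(db, [{
    "rule_id": "R-001",
    "target": "ftp-service",
    "vulnerability": "anonymous write access enabled",
    "possible_intrusion": "upload of hostile files",
    "countermeasure": "disable anonymous write access",
}])
```

Keying rules by an identifier lets a later directory update overwrite an outdated rule instead of duplicating it.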

3.2.5 Manager Tool 

Manager Tool helps the administration of Security Evaluation System. A security 
manager can configure each component of Security Evaluation System remotely. He 
can centrally manage the security evaluation rules and security policy. Manager Tool 
helps the security manager to view the evaluation reports in various formats. 

3.2.6 Etc 

Each component of the Security Evaluation System, such as the Agents, Subnet Analyzers, 
Domain Analyzer, Manager Tool and Security Evaluation Rule Manager, encrypts 
confidential data such as security evaluation reports. For authentication 
between the components, they use certificates issued by a trusted CA (Certificate 
Authority). 

The Security Policy Server is the server that manages the security policy of the network 
domain. The Security Evaluation System always references the security policy when it 
performs a security evaluation and checks whether each network resource obeys the 
security policy. This makes the components of the enterprise network follow a 
consistent security policy. 

3.2.7 Features of the System Design 

The Security Evaluation System proposed in this paper evaluates the current security status 
of the enterprise network and supports decision-making about security management to 
remove the vulnerabilities of the network and to prevent possible intrusions. Security 
evaluation performed by each analyzer module (Agent, Subnet Analyzer, Domain 
Analyzer) is based on the security evaluation rules that are managed by the Rule Manager. 
The Rule Manager frequently gets new vulnerability information from the external trusted 
directory service and updates the security evaluation rules based on it. This feature 
lets the Security Evaluation System cope with new vulnerabilities rapidly. 

In this design, security analysis processes are distributed over several analyzers. 
That is, an Agent analyzes the security of a host, a Subnet Analyzer analyzes the 
security of a subnet, and the Domain Analyzer analyzes the results from all analyzers. 
Therefore the processing load of each analyzer is relatively small. If the number of hosts in 
the network domain increases, overall evaluation performance will tend to decrease. However, 
the overall performance can be preserved by adding new Agents and 
Subnet Analyzers to the network domain, since the processing load can be distributed 
over the new analyzers. That is, the Security Evaluation System is scalable. The Security 
Evaluation System requires relatively little network traffic because not all the host or 
network information is sent to the Domain Analyzer. Agents and Subnet Analyzers gather 
all the information about their hosts or subnets, analyze it, and 
send only security evaluation reports to the Domain Analyzer. The reports contain 
only information related to the security evaluation result of the host or the subnet. This 
also greatly reduces the amount of data that the Domain Analyzer has to analyze. 
Therefore, the Security Evaluation System is suitable for large enterprise networks. 
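
The division of labor described above can be sketched as a three-level pipeline in which raw host data stays with the Agent and only findings travel upward. The data shapes, function names, and check names are hypothetical, and the checks are reduced to pass/fail flags for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Report:
    source: str
    findings: List[str] = field(default_factory=list)


def agent_evaluate(host: str, raw_facts: Dict[str, bool]) -> Report:
    """Agent: inspect all raw host facts locally, report only the failed checks."""
    return Report(host, [name for name, ok in raw_facts.items() if not ok])


def subnet_evaluate(subnet: str, agent_reports: List[Report]) -> Report:
    """Subnet Analyzer: merge the agent reports into one subnet report."""
    findings = [f"{r.source}: {f}" for r in agent_reports for f in r.findings]
    return Report(subnet, findings)


def domain_evaluate(subnet_reports: List[Report]) -> Dict[str, List[str]]:
    """Domain Analyzer: sees only subnet reports, never the raw host facts."""
    return {r.source: r.findings for r in subnet_reports if r.findings}


hosts = {
    "host-a": {"patches_current": True, "system_files_intact": True},
    "host-b": {"patches_current": False, "system_files_intact": True},
}
agent_reports = [agent_evaluate(h, facts) for h, facts in hosts.items()]
domain_view = domain_evaluate([subnet_evaluate("subnet-1", agent_reports)])
```

In this sketch the Domain Analyzer receives a single finding for `host-b` rather than every fact about both hosts, which is the traffic and load reduction the design relies on.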

Each analyzer module in Security Evaluation System references the security policy 
of the enterprise network domain and checks whether the components of the network 
obey the security policy whenever the analyzer performs the security evaluation of the 
component. It makes the components of the enterprise network follow the consistent 
security policy. 



3.3 Detailed Architecture and Processing Flow of the Security Evaluation System 

The processing flow of the Security Evaluation System proceeds in the following order. The 
Security Evaluation Rule DB is updated frequently, independently of the analyzers 
in the Security Evaluation System. 

■ Security evaluation of each host by Agents 

■ Security evaluation of each subnet by Subnet Analyzers 

■ Decision-making about security management of the network by the Domain Analyzer 

The processing flow of each analyzer module (Agent, Subnet Analyzer, 
Domain Analyzer) proceeds in the following order. 

■ Analysis of the host or network resources based on the Security Evaluation Rules and the Security Policy 

■ Generation of the Security Evaluation Result Report 

■ Notification of the evaluation result to the higher analyzer module 

Both the security evaluation rules and the security policy have to be referenced 
during security evaluation. To do this, the analyzer selects the rules and policy that are 
suitable for the target hosts or network environment from the Security Evaluation Rule 
DB and the Security Policy DB, and stores them in the Customized Rule DB. The customized 
rules in the DB are referenced for the security evaluation. If customized rules 
already exist and were made for the same host or network environment from the 
same security evaluation rules and security policy, the analyzer skips the Rule DB and 
Policy DB access and uses the existing customized rules. This reduces the 
DB access and search time. If there is any conflict between the security evaluation rules 
and the security policy, the security policy takes precedence. 
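
The caching behavior and the precedence of policy over rules can be sketched as follows. Representing rules and policy as dictionaries of settings, and tracking a version pair to decide whether cached rules are still valid, are assumptions made for illustration; the paper does not describe how the analyzer detects that the underlying rules or policy are unchanged.

```python
class RuleCustomizer:
    """Hypothetical Rule Customizer with a cache standing in for the Customized Rule DB."""

    def __init__(self, rule_db, policy_db):
        self.rule_db = rule_db      # dict: environment -> {setting: required value}
        self.policy_db = policy_db  # same shape as rule_db
        self.versions = (1, 1)      # assumed (rule DB version, policy DB version)
        self._cache = {}
        self.db_reads = 0           # counts Rule DB / Policy DB accesses

    def customized_rules(self, environment):
        key = (environment, self.versions)
        if key in self._cache:
            # Same environment, same rules and policy: skip the DB access.
            return self._cache[key]
        self.db_reads += 1
        merged = dict(self.rule_db.get(environment, {}))
        # On conflict the security policy overrides the evaluation rule.
        merged.update(self.policy_db.get(environment, {}))
        self._cache[key] = merged
        return merged


rules = {"unix-host": {"telnet": "warn", "min_password_length": 6}}
policy = {"unix-host": {"telnet": "forbid"}}
customizer = RuleCustomizer(rules, policy)
first = customizer.customized_rules("unix-host")
second = customizer.customized_rules("unix-host")
```

Here the second call is served from the cache, and the conflicting `telnet` setting takes the policy's value. Bumping `versions` after a rule or policy update would force a fresh merge.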

Detailed architecture and processing flow of the Security Evaluation System are as 
follows. 

3.3.1 Generation of the Security Evaluation Rules by the Security Evaluation 
Rule Manager 

Fig. 2 illustrates the architecture of the Security Evaluation Rule Manager. 

Fig. 2. Architecture of the Security Evaluation Rule Manager 



The Security Evaluation Rule Manager frequently gets new vulnerability information from the 
external trusted directory service. At the security manager’s command from the Manager Tool, 
the Security Evaluation Rule Generator generates security evaluation rules from the 
vulnerability information and stores them in the Security Evaluation 
Rule DB. The security manager generates, updates and deletes the security 
evaluation rules. 

The Security Evaluation Rule Manager helps the security manager generate and manage 
the evaluation rules. Security evaluation by Agents, Subnet Analyzers, and the 
Domain Analyzer is based on the security evaluation rules. If an analyzer requests a 
security evaluation rule, the Rule Manager searches for the rule in the Rule DB and returns 
it through the analyzer interface. 

3.3.2 Security Evaluation of a Host Using an Agent 

Fig. 3 illustrates the architecture of the Agent. 

Fig. 3. Architecture of the Agent 

Rule Customizer of each Agent selects the rules and policy that are suitable to the 
target host from the Security Evaluation Rule Manager and the Security Policy Server, 
and stores them in the Customized Rule DB. If the Customized Host Rules already 
exist and if they are made for the same host environment from the same security 
evaluation rules and Security Policy, the Agent omits the Rule DB and Policy DB 
access and uses the customized rules that already exist. 

Host Check Engine evaluates the security of the target host based on the Custom- 
ized Host Rules. It may exchange the host security information with other Agents to 
find out the vulnerability that can be checked using the information of other hosts. 

It checks the vulnerability of the target host using the Host Hacking Simulator. In this 
case, the Host Hacking Simulator has to leave a message on the host so that the test can 
be distinguished from real hacking. The Host Check Engine accesses the low-level details of the 
target host’s operating system, specific services, and configuration. It checks for 
system configuration errors, bug patch status, signs of a hacker’s intrusion, 
system file integrity, and the existence of Trojan horses, viruses or sniffers. It also 
checks whether the host obeys the security policy. 
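
One of the checks listed above, system file integrity, can be illustrated with a baseline of file digests that is compared against the current state of the files. This is a generic sketch rather than the Host Check Engine's actual method, which the paper does not detail; the function names and the use of SHA-256 are assumptions.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of the file contents (hash choice is an assumption)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def snapshot(paths):
    """Record a trusted baseline: file name -> digest."""
    return {str(p): file_digest(Path(p)) for p in paths}


def integrity_findings(baseline):
    """Compare the current files against the baseline and list deviations."""
    findings = []
    for name, expected in baseline.items():
        path = Path(name)
        if not path.exists():
            findings.append(f"{name}: missing")
        elif file_digest(path) != expected:
            findings.append(f"{name}: modified")
    return findings
```

An Agent could take `snapshot` once on a known-good system and include the output of `integrity_findings` in its host report. The baseline itself must be protected, since an intruder who can rewrite it can hide modifications.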

The Agent Report Manager generates the host security evaluation result report, which 
describes the information related to the vulnerabilities based on the security evaluation 
result from the Host Check Engine. It stores the report in the Agent Report DB and sends 
the report to the Subnet Analyzer that is in the same subnet. 

3.3.3 Security Evaluation of a Subnet Using a Subnet Analyzer 

Fig. 4 illustrates the architecture of the Subnet Analyzer. 

Fig. 4. Architecture of the Subnet Analyzer 

If the Customized Subnet Rule does not exist in the Customized Rule DB, Rule 
Customizer of the Subnet Analyzer selects the rules and policy that are suitable to the 
target subnet from the Security Evaluation Rule Manager and the Security Policy 
Server. It stores them in the Customized Subnet Rule DB. 

The Subnet Check Engine evaluates the security of the target subnet based on the 
Customized Subnet Rules. It may exchange subnet security information with other 
Subnet Analyzers to find vulnerabilities that can be checked using the 
information of other subnets. 

It checks the vulnerability of the remote hosts using the Network Hacking Simulator. In 
this case, the Network Hacking Simulator has to leave a message on each remote host so that 
the test can be distinguished from real hacking. 

The Subnet Check Engine evaluates the security of a subnet from the viewpoint of a 
network user. Each Subnet Analyzer analyzes the vulnerabilities of the network services 
and daemons. Using the Agent Report Analyzer, it analyzes the host security evaluation 
reports from each Agent in the subnet and finds vulnerabilities that can be detected only 
with information from several hosts that are part of a distributed environment. 
It also checks whether the subnet obeys the security policy. 

A Design of the Security Evaluation System 

Subnet Report Manager generates the subnet security evaluation result report that 
describes the information related to the vulnerability based on the security evaluation 
result from Subnet Check Engine. It stores the report in the Subnet Report DB and 
sends the report to the Domain Analyzer. 

3.3.4 Decision-Making about Security Management of the Enterprise Network 
Domain Using Domain Analyzer 

Fig. 5 illustrates the architecture of the Domain Analyzer. 

Fig. 5. Architecture of the Domain Analyzer 

Rule Customizer of the Domain Analyzer selects the rules and policy that are nec- 
essary for Domain Analyzer from Security Evaluation Rule Manager and Security 
Policy Server, and stores them in the Customized Domain Rule DB. 

The Security Management Decision Engine evaluates the security of the entire 
enterprise network domain based on the Customized Domain Rules and the security evaluation 
results from the Agents and Subnet Analyzers. Using the Subnet Report Analyzer, it analyzes 
the subnet security evaluation reports from each subnet in the domain and finds 
vulnerabilities that can be checked with the information of several subnets. 
It also finds correlations between the vulnerabilities. It merges duplicated 
results and removes wrong results from the evaluation results. It predicts the possible 
intrusions due to the vulnerabilities analyzed so far. Then it makes decisions about 
security management to prevent the possible intrusions. 
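
The merging of duplicated results and a very simple form of correlation can be sketched as follows. Treating each finding as a (host, vulnerability) pair and calling a vulnerability "correlated" when it appears on several hosts are assumptions made for this example; the paper does not define the report format or the correlation criterion.

```python
from collections import defaultdict


def merge_findings(subnet_reports):
    """Collapse duplicate findings and group hosts by vulnerability.

    Each report is a list of (host, vulnerability) pairs (assumed shape).
    """
    unique = {finding for report in subnet_reports for finding in report}
    hosts_by_vuln = defaultdict(set)
    for host, vuln in unique:
        hosts_by_vuln[vuln].add(host)
    return hosts_by_vuln


def correlated(hosts_by_vuln, min_hosts=2):
    """Vulnerabilities seen on several hosts, a simple stand-in for correlation."""
    return sorted(v for v, hosts in hosts_by_vuln.items() if len(hosts) >= min_hosts)


reports = [
    [("host-a", "weak-rsh-trust"), ("host-b", "weak-rsh-trust")],
    [("host-b", "weak-rsh-trust"), ("host-c", "open-relay")],
]
merged = merge_findings(reports)
```

The finding for `host-b` is reported by two subnets but counted once, and `correlated(merged)` singles out the vulnerability that affects more than one host as a candidate for a domain-wide decision.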

Security Management Decision Report Manager generates the Security Manage- 
ment Decision Report based on the security management decisions from Security 
Management Decision Engine. It stores the report in the Security Management Deci- 
sion Report DB. A security manager can view the report using Manager Tool and use 
it for decision-making about security management of the network. 

The Domain Analyzer notifies the Security Management System or the Intrusion Detection 
System of the decisions about security management and helps them prevent 
the possible intrusions. It requests appropriate system configuration changes from the Security 
Management System, and it notifies the Intrusion Detection System of the possible 
intrusions. 

If signs of serious hacking are found, it sends a notification mail to the 
Information Security Agency. It also sends a notification mail to the system manager 
of any host that has security weaknesses. 



3.3.5 Report View and System Management Using Manager Tool 

Fig. 6 illustrates the architecture of the Manager Tool. 

Fig. 6. Architecture of the Manager Tool 

A security manager can configure each component of Security Evaluation System 
remotely using Manager Tool. He can centrally manage Security Evaluation Rules and 
Security Policy. Manager Tool helps the security manager to view the evaluation re- 
ports in various formats. 

The User Interface calls the proper module to process the manager’s requests and 
displays the result to the manager. A security manager can add, delete, or update the 
security policy of the Security Policy Server using the Security Policy Management 
Module. He can view the reports in various formats using the Report View Module. He also 
can add, delete, or update the security evaluation rules of the Security Evaluation Rule 
Manager. He can configure the system or view logs using the System Configuration & Log 
View module. 



4 Conclusions 

In this paper we analyzed the requirements of a Security Evaluation System that 
automates the security evaluation of an enterprise network consisting of various 
components and that supports decision-making about security management to prevent 
intrusion, and we proposed a design that satisfies the requirements. 

In our design, the Security Evaluation System frequently gets new vulnerability information from 
directory services and updates the security evaluation rules based on it. 
Therefore the Security Evaluation System can cope with new vulnerabilities rapidly. 
The system analyzes the security of each host and subnet that is a component of the 
enterprise network, and it evaluates the security of the entire enterprise network 
domain based on the analysis result of each component. The Security Evaluation System then 
supports decision-making about security management based on the evaluation result to 
prevent intrusions in advance. 

In the system design, security analysis processes are distributed over several 
analyzers such as Agents and Subnet Analyzers. Therefore the processing load of each 
analyzer is relatively small. The overall performance of the Security Evaluation System can 
be preserved by adding new Agents and Subnet Analyzers to the network domain 
even when the number of hosts in the network domain increases. That 
is, the Security Evaluation System is scalable. The Security Evaluation System requires 
relatively little network traffic because not all the host or network information is sent to 
the Domain Analyzer. Agents and Subnet Analyzers send security evaluation reports to 
the Domain Analyzer, and these contain only information related to the security 
evaluation result of the host or the subnet. This also greatly reduces the amount of data that the 
Domain Analyzer has to analyze. Therefore, the Security Evaluation System is suitable 
for large enterprise networks. 

Security Evaluation System evaluates the security of the network based on the secu- 
rity policy of the Security Policy Server. Therefore, it is easy to check whether the 
various components of the enterprise network obey the security policy of the network 
domain or not. 

The Security Evaluation System references the security policy for the security 
evaluation, but research on security policy systems has so far been insufficient and further 
research is required. Artificial Intelligence technology such as Expert Systems has to be 
considered for accurate and intelligent decision support in network security 
management. AI technology is also needed for predicting the intrusions caused by 
specific vulnerabilities. 